Abstract

We introduce a new technique for proving kernelization lower bounds, called cross-composition. 
A classical problem L cross-composes into a parameterized problem Q if an instance of Q with 
polynomially bounded parameter value can express the logical OR of a sequence of instances of L. 
Building on work by Bodlaender et al. (ICALP 2008) and using a result by Fortnow and Santhanam
(STOC 2008) we show that if an NP-hard problem cross-composes into a parameterized
problem Q then Q does not admit a polynomial kernel unless the polynomial hierarchy collapses. 

Our technique generalizes and strengthens the recent techniques of using OR-composition 
algorithms and of transferring the lower bounds via polynomial parameter transformations. We 
show its applicability by proving kernelization lower bounds for a number of important graph
problems with structural (non-standard) parameterizations, e.g., Chromatic Number, Clique, 
and Weighted Feedback Vertex Set do not admit polynomial kernels with respect to the 
vertex cover number of the input graphs unless the polynomial hierarchy collapses, contrasting 
the fact that these problems are trivially fixed-parameter tractable for this parameter. We have 
similar lower bounds for Feedback Vertex Set. 
1 Introduction

Preprocessing and data reduction are important and widely applied concepts for speeding 
up polynomial-time algorithms or for making computation feasible at all in the case of 
hard problems that are not believed to have efficient algorithms. Kernelization is a way of 
formalizing data reduction, which allows for a formal analysis of the (im)possibility of data 
reduction and preprocessing. It originated as a technique to obtain fixed-parameter tractable 
algorithms for hard (parameterized) problems, and has evolved into its own topic of research 
(see [IH1I2] for recent surveys). A parameterized problem [THUS] is a language $Q \subseteq \Sigma^* \times \mathbb{N}$; the
second component is called the parameter. A kernelization algorithm (kernel) transforms an
instance $(x, k)$ in polynomial time into an equivalent instance $(x', k')$ such that $|x'|, k' \le f(k)$
for some computable function $f$, which is the size of the kernel.

From a practical perspective we are particularly interested in cases where $f \in k^{O(1)}$,
so-called polynomial kernels. Success stories of kernelization include the $O(k^2)$ kernel for
k-Vertex Cover containing at most $2k$ vertices [TT] and the meta-theorems for kernelization
of problems on planar graphs [3], among many others (cf. also [23]). Although researchers
have looked for polynomial kernels for elusive problems such as k-Path for many years, it was
only recently that techniques were introduced which make it possible to prove (under some 
complexity-theoretic assumption) that a parameterized problem in FPT does not admit a 
polynomial kernel. Bodlaender et al. [3] introduced the concept of an OR-composition algorithm
as a tool to give super-polynomial lower bounds on kernel sizes. Consider some set $S$, and
let $\mathrm{OR}(S)$ denote the set such that for any sequence $x^* := (x_1, \ldots, x_t)$ of instances of $S$ we
have $x^* \in \mathrm{OR}(S) \Leftrightarrow \bigvee_{i=1}^{t} x_i \in S$; then we could say that the language $\mathrm{OR}(S)$ expresses the
OR of instances of $S$. The approach taken in the original paper by Bodlaender et al. [3] uses
a theorem by Fortnow and Santhanam [17] to show that if there is a polynomial-time
OR-composition algorithm that maps any sequence of instances $(x_1, k), (x_2, k), \ldots, (x_t, k)$ of some
parameterized problem $Q$ which all share the same parameter value to an instance $(x^*, k^*)$
of $Q$ which acts as the OR of the inputs and $k^* \in k^{O(1)}$, then $Q$ does not admit a polynomial
kernel unless NP $\subseteq$ coNP/poly. This machinery made it possible to prove e.g. that
k-Path and the Clique problem parameterized by the treewidth of the graph do not admit
polynomial kernels unless NP $\subseteq$ coNP/poly$^1$. The latter is deemed unlikely since it is known
to imply a collapse of the polynomial hierarchy to its third level [25] (and further [5]).

It did not take long before the techniques of Bodlaender et al. were combined with the 
notion of a polynomial parameter transformation to also prove lower bounds for problems 
for which no direct OR-composition algorithm could be found. This idea was used implicitly 
by Fernau et al. [TS] to show that k-Leaf Out-Branching does not admit a polynomial
kernel, and was formalized in a paper by Bodlaender et al. [6]: they showed that if there is
a polynomial-time transformation from P to Q which incurs only a polynomial blow-up in 
the parameter size, then if P does not admit a polynomial kernel then Q does not admit 
one either. These polynomial parameter transformations were used extensively by Dom et
al. [13] who proved kernelization lower bounds for a multitude of important parameterized
problems such as Small Universe Hitting Set and Small Universe Set Cover. Dell
and van Melkebeek [12] were able to extend the techniques of Fortnow and Santhanam to
prove, e.g., that Vertex Cover does not admit a kernel of size $O(k^{2-\varepsilon})$ for any $\varepsilon > 0$.

Our results. We introduce a new technique to prove kernelization lower bounds, which
we call cross-composition. This technique generalizes and strengthens the earlier methods of
OR-composition [3] and polynomial parameter transformations [6], and puts the two existing
methods of showing kernelization lower bounds in a common perspective. Whereas the 
existing notion of OR-composition works by composing multiple instances of a parameterized 
problem Q into a single instance of Q with a bounded parameter value, for our new technique 
it is sufficient to compose the OR of any classical NP-hard problem into an instance of the 
parameterized problem Q for which we want to prove a lower bound. The term cross in
the name stems from this fact: the source and target problems of the composition need no
longer be the same. Since the input to a cross-composition algorithm is a list of classical
instances instead of parameterized instances, the inputs do not have a parameter in which 
the output parameter of the composition must be bounded; instead we require that the size 
of the output parameter is polynomially bounded in the size of the largest input instance. In 
addition we show that the output parameter may depend polynomially on the logarithm of 
the number of input instances, which often simplifies the constructions and proofs. We also 
introduce the concept of a polynomial equivalence relation to remove the need for padding
arguments which were frequently required for OR-compositions.

1 In the remainder of this introduction we assume that NP $\not\subseteq$ coNP/poly when stating kernelization lower
bounds.

Problem name          Parameter               Kernel size
Clique                vertex cover            not polynomial [Section 4.1]
Chromatic Number      vertex cover            not polynomial [Section 4.2]
Feedback Vertex Set   dist. from cluster      not polynomial [Section 4.3]
Feedback Vertex Set   dist. from co-cluster   not polynomial [Section 4.3]
Weighted FVS          vertex cover            not polynomial [Section 4.3]

Table 1 An overview of the kernelization lower bounds obtained in this paper; all listed
problems are fixed-parameter tractable with respect to this parameterization. Section 4 describes
the parameterized problems in more detail.

To show the power of cross-composition we give kernelization lower bounds for structural 
parameterizations of several important graph problems. Since many combinatorial problems 
are easy on graphs of bounded treewidth [5], and since the treewidth of a graph is bounded by
the vertex cover number, it is often thought that almost all problems become tractable when 
parameterized by the vertex cover number of the graph. We show that this is not the case 
for kernelization: Clique, Chromatic Number and Weighted Feedback Vertex Set 
do not admit polynomial kernels parameterized by the vertex cover number of the graph. In 
the case of Clique it was already known [3] that the problem does not admit a polynomial 
kernel parameterized by the treewidth of the graph; since the vertex cover number is at 
least as large as the treewidth we prove a stronger result. For the unweighted Feedback 
Vertex Set problem, which admits a polynomial kernel parameterized by the target size of 
the feedback set [24], we show that there is no polynomial kernel for the parameterization by
deletion distance to cluster graphs or co-cluster graphs. 

Organization. The paper is organized as follows. We first give some preliminary 
definitions. Section 3 gives the formal definition of cross-composition, and proves that
cross-compositions allow us to give kernelization lower bounds. In Section 4 we apply the
new technique to obtain kernelization lower bounds for various problems. 

2 Preliminaries 

In this work we only consider undirected, finite, simple graphs. Let G be a graph and denote 
its vertex set by $V(G)$ and the edge set by $E(G)$. We use $\chi(G)$ to denote the chromatic
number of $G$. If $V' \subseteq V(G)$ then $G[V']$ denotes the subgraph of $G$ induced by $V'$. A graph
is a cluster graph if every connected component is a clique. A graph is a co-cluster graph
if it is the edge-complement of a cluster graph. Throughout this work we use $\Sigma$ to denote
a finite alphabet, but note that multiple occurrences of $\Sigma$ may refer to different alphabets.
For positive integers $n$ we define $[n] := \{1, \ldots, n\}$. The satisfiability problem for boolean
formulae is referred to as SAT. For completeness we give the following core definitions of
parameterized complexity [3, 14].

► Definition 1. A parameterized problem is a language $Q \subseteq \Sigma^* \times \mathbb{N}$, and is contained in the
class (strongly uniform) FPT (for Fixed-Parameter Tractable) if there is an algorithm that
decides whether $(x, k) \in Q$ in $f(k)|x|^{O(1)}$ time for some computable function $f$.

► Definition 2. A kernelization algorithm |19| 12], or in short, a kernel for a parameterized
problem $Q \subseteq \Sigma^* \times \mathbb{N}$ is an algorithm that given $(x, k) \in \Sigma^* \times \mathbb{N}$ outputs in $p(|x| + k)$ time a
pair $(x', k') \in \Sigma^* \times \mathbb{N}$ such that:
• $(x, k) \in Q \Leftrightarrow (x', k') \in Q$,
• $|x'|, k' \le f(k)$,

where $f$ is a computable function, and $p$ a polynomial. Any function $f$ as above is referred
to as the size of the kernel; if $f$ is a polynomial then we have a polynomial kernel.
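
To make Definition 2 concrete, the following is a minimal sketch of a textbook kernelization for Vertex Cover based on the classical high-degree rule (commonly attributed to Buss). It is not taken from this paper and it is not the 2k-vertex kernel mentioned in the introduction; the function name, the edge-set representation, and the choice of constant-size NO instance are assumptions made for illustration.

```python
# Hypothetical illustration: high-degree rule for Vertex Cover.
# Input: a simple graph as an iterable of 2-element edges, and a budget k.
# Output: an equivalent instance with at most k'^2 edges, or a fixed NO instance.

NO_INSTANCE = ({frozenset(("a", "b"))}, 0)  # one edge, budget 0: never coverable


def buss_kernel(edges, k):
    edges = {frozenset(e) for e in edges}
    changed = True
    while changed and k >= 0:
        changed = False
        degree = {}
        for e in edges:
            for v in e:
                degree[v] = degree.get(v, 0) + 1
        for v, d in degree.items():
            if d > k:
                # A vertex of degree > k belongs to every cover of size <= k.
                edges = {e for e in edges if v not in e}
                k -= 1
                changed = True
                break
    # Now every degree is <= k, so k vertices cover at most k*k edges.
    if k < 0 or len(edges) > k * k:
        return NO_INSTANCE
    return edges, k
```

The output size depends only on the parameter, matching the requirement $|x'|, k' \le f(k)$ with a polynomial $f$.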

3 Cross-Composition 

3.1 The Definition 

In this section we define the concept of cross-composition and give all the terminology needed 
to apply the technique. 

► Definition 3 (Polynomial equivalence relation). An equivalence relation $\mathcal{R}$ on $\Sigma^*$ is called a
polynomial equivalence relation if the following two conditions hold:

1. There is an algorithm that given two strings $x, y \in \Sigma^*$ decides whether $x$ and $y$ belong to
the same equivalence class in $(|x| + |y|)^{O(1)}$ time.

2. For any finite set $S \subseteq \Sigma^*$ the equivalence relation $\mathcal{R}$ partitions the elements of $S$ into at
most $(\max_{x \in S} |x|)^{O(1)}$ classes.

► Definition 4 (Cross-composition). Let $L \subseteq \Sigma^*$ be a set and let $Q \subseteq \Sigma^* \times \mathbb{N}$ be a parameterized
problem. We say that $L$ cross-composes into $Q$ if there is a polynomial equivalence
relation $\mathcal{R}$ and an algorithm which, given $t$ strings $x_1, x_2, \ldots, x_t$ belonging to the same equivalence
class of $\mathcal{R}$, computes an instance $(x^*, k^*) \in \Sigma^* \times \mathbb{N}$ in time polynomial in $\sum_{i=1}^{t} |x_i|$
such that:

1. $(x^*, k^*) \in Q \Leftrightarrow x_i \in L$ for some $1 \le i \le t$,

2. $k^*$ is bounded by a polynomial in $\max_{i=1}^{t} |x_i| + \log t$.

3.2 How Cross-compositions Imply Lower Bounds 

The purpose of this section is to prove that cross-compositions imply kernelization lower 
bounds. To give this proof we need some concepts from earlier work [5| 117) [T^j. 

► Definition 5 ([17]). A weak distillation of SAT into a set $L \subseteq \Sigma^*$ is an algorithm that:
• receives as input a sequence $(x_1, \ldots, x_t)$ of instances of SAT,
• uses time polynomial in $\sum_{i=1}^{t} |x_i|$,
• and outputs a string $y \in \Sigma^*$ with

1. $y \in L \Leftrightarrow x_i \in \mathrm{SAT}$ for some $1 \le i \le t$,

2. $|y|$ is bounded by a polynomial in $\max_{i=1}^{t} |x_i|$.

► Theorem 6 (Theorem 1.2 [17]). If there is a weak distillation of SAT into any set $L \subseteq \Sigma^*$
then NP $\subseteq$ coNP/poly and the polynomial-time hierarchy collapses to the third level (PH $= \Sigma_3^p$).

► Definition 7 ([12]). The OR of a language $L \subseteq \Sigma^*$ is the set $\mathrm{OR}(L)$ that consists of all
tuples $(x_1, \ldots, x_t)$ for which there is an index $1 \le i \le t$ with $x_i \in L$.

► Definition 8 ([3]). We associate an instance $(x, k)$ of a parameterized problem with the
unparameterized instance formed by the string $x\#1^k$, where $\#$ denotes a new character that
we add to the alphabet and 1 is an arbitrary letter in $\Sigma$. The unparameterized version of a
parameterized problem $Q$ is the language $\tilde{Q} = \{x\#1^k \mid (x, k) \in Q\}$.
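
Definitions 7 and 8 are simple enough to state as code. The sketch below is only an illustration: the function names are hypothetical, strings stand in for words over $\Sigma$, and membership in the language is passed in as an arbitrary predicate.

```python
# Hypothetical helpers illustrating Definitions 7 and 8.

def unparameterize(x: str, k: int) -> str:
    """Encode (x, k) as x#1^k; '#' must not occur in x."""
    if "#" in x or k < 0:
        raise ValueError("x must not contain '#' and k must be non-negative")
    return x + "#" + "1" * k


def parameterize(s: str):
    """Inverse of unparameterize: split x#1^k back into (x, k)."""
    x, sep, ones = s.partition("#")
    if sep != "#" or set(ones) - {"1"}:
        raise ValueError("not of the form x#1^k")
    return x, len(ones)


def in_or(instances, in_language) -> bool:
    """A tuple is in OR(L) iff at least one component is in L."""
    return any(in_language(x) for x in instances)
```

Because the parameter is written in unary, the length of the encoded string is $|x| + k + 1$, so a bound on $|x|$ and $k$ carries over to the unparameterized instance.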
► Theorem 9. Let $L \subseteq \Sigma^*$ be a set which is NP-hard under Karp reductions. If $L$ cross-composes
into the parameterized problem $Q$ and $Q$ has a polynomial kernel then there is a
weak distillation of SAT into $\mathrm{OR}(\tilde{Q})$ and NP $\subseteq$ coNP/poly.

Proof. The proof is by construction and generalizes the concepts of Bodlaender et al. [3].
Assuming the conditions in the statement of the theorem hold, we show how to build an
algorithm which distills SAT into $\mathrm{OR}(\tilde{Q})$. By the definition of cross-composition there is a
polynomial equivalence relation $\mathcal{R}$ and an algorithm $C$ which composes $L$-instances belonging
to the same class of $\mathcal{R}$ into a $Q$-instance.

The input to the distillation algorithm consists of a sequence $(x_1, \ldots, x_t)$ of instances of
SAT, which we may assume are elements of $\Sigma^*$. Define $m := \max_{i=1}^{t} |x_i|$. If $t > (|\Sigma| + 1)^m$ then
there must be duplicate inputs, since the number of distinct inputs of length $m' \le m$ is $|\Sigma|^{m'}$.
By discarding duplicates we may therefore assume that $t \le (|\Sigma| + 1)^m$, i.e., $\log t \in O(m)$. By
the assumption that $L$ is NP-hard under Karp reductions, there is a polynomial-time reduction
from SAT to $L$. We use this reduction to transform each SAT instance $x_i$ for $1 \le i \le t$ into an
equivalent $L$-instance $y_i$. Since the transformation takes polynomial time, it cannot increase
the size of an instance by more than a polynomial factor and therefore $|y_i|$ is polynomial
in $m$ for all $i$.
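
The counting step in this paragraph can be spelled out as follows. This is a standard estimate via the binomial theorem, written here as a worked derivation rather than quoted from the paper.

```latex
\[
  \bigl|\{x \in \Sigma^* : |x| \le m\}\bigr|
  \;=\; \sum_{m'=0}^{m} |\Sigma|^{m'}
  \;\le\; \sum_{m'=0}^{m} \binom{m}{m'} |\Sigma|^{m'}
  \;=\; (|\Sigma| + 1)^m ,
\]
\[
  t \le (|\Sigma| + 1)^m
  \;\Longrightarrow\;
  \log t \le m \cdot \log(|\Sigma| + 1) \in O(m),
\]
% since |\Sigma| is a constant that does not depend on the input.
```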

The algorithm now pairwise compares instances using the polynomial-time equivalence test
of $\mathcal{R}$ (whose existence is guaranteed by Definition 3) to partition the $L$-instances $(y_1, \ldots, y_t)$
into partite sets $Y_1, \ldots, Y_r$ such that all instances from the same partite set are equivalent
under $\mathcal{R}$. The properties of a polynomial equivalence relation guarantee that $r$ is polynomial
in $m$ and that this partitioning step takes polynomial time in the total input size.
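
The partitioning step can be sketched as below. The name `partition_by_equivalence` and the callable `equivalent` are hypothetical; the only thing assumed about `equivalent` is that it is an equivalence test, as Definition 3 requires.

```python
# Hypothetical sketch: partition instances into classes using only a
# pairwise equivalence test. Each item is compared against one
# representative per class, so at most r * t tests are made for t items
# and r classes.

def partition_by_equivalence(items, equivalent):
    classes = []
    for item in items:
        for cls in classes:
            if equivalent(cls[0], item):  # compare with the representative
                cls.append(item)
                break
        else:
            classes.append([item])  # item starts a new class
    return classes
```

Comparing against a single representative is sufficient because the relation is transitive; with $r$ polynomial in $m$ the number of tests stays polynomial in the total input size.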

We now use the cross-composition algorithm $C$ on each of the partite sets $Y_1, \ldots, Y_r$,
which is possible since all instances from the same set are equivalent under $\mathcal{R}$. Let $(z_i, k_i)$ be
the result of applying $C$ to a sequence containing the contents of the set $Y_i$, for $1 \le i \le r$.
From the definition of cross-composition and using $\log t \in O(m)$ it follows that each $k_i$ is
polynomial in $m$, and that the computation of these parameterized instances takes polynomial
time in the total input size. From Definition 4 it follows that $(z_i, k_i)$ is a YES-instance of $Q$
if and only if one of the instances in $Y_i$ is a YES-instance of $L$, which in turn happens if and
only if one of the inputs $x_i$ is a YES-instance of SAT.

Let $K$ be a polynomial kernelization algorithm for $Q$, whose existence we assumed in
the statement of the theorem. We apply $K$ to the instance $(z_i, k_i)$ to obtain an equivalent
instance $(z'_i, k'_i)$ of $Q$ for each $1 \le i \le r$. Since $K$ is a polynomial kernelization we know
that these transformations can be carried out in polynomial time and that $|z'_i|, k'_i \le k_i^{O(1)}$.
Since $k_i$ is polynomial in $m$ it follows that $|z'_i|$ and $k'_i$ are also polynomial in $m$ for $1 \le i \le r$.

As the next step we convert each parameterized instance $(z'_i, k'_i)$ to the unparameterized
variant $\tilde{z}_i := z'_i \# 1^{k'_i}$. Since the values of the parameters are polynomial in $m$ this
transformation takes polynomial time, and afterwards we find that $|\tilde{z}_i|$ is polynomial in $m$ for
each $1 \le i \le r$.

The last stage of the algorithm simply combines all unparameterized variants into one
tuple $x^* := (\tilde{z}_1, \tilde{z}_2, \ldots, \tilde{z}_r)$. Since the size of each component is polynomial in $m$, and
since the number of components $r$ is polynomial in $m$, we have that $|x^*|$ is polynomial
in $m$. The tuple $x^*$ forms an instance of $\mathrm{OR}(\tilde{Q})$, and by the definition of $\mathrm{OR}(\tilde{Q})$ we know
that $x^* \in \mathrm{OR}(\tilde{Q})$ if and only if some element of the tuple is contained in $\tilde{Q}$. By tracing back
the series of equivalences we therefore find that $x^* \in \mathrm{OR}(\tilde{Q})$ if and only if some input $x_i$ is a
YES-instance of SAT. Since we can construct $x^*$ in polynomial time and $|x^*|$ is polynomial
in $m$, we have constructed a weak distillation of SAT into $\mathrm{OR}(\tilde{Q})$. By Theorem 6 this implies
NP $\subseteq$ coNP/poly and proves the theorem.
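
The stages of the proof can be summarized as a pipeline. The sketch below is schematic: `reduce_to_L`, `equivalent`, `cross_compose` and `kernelize` are hypothetical callables standing for the Karp reduction, the equivalence test of the polynomial equivalence relation, the cross-composition algorithm and the assumed polynomial kernel; none of them is implemented here, and instances are represented as plain strings.

```python
# Hypothetical skeleton of the distillation algorithm from the proof.

def distill(sat_instances, reduce_to_L, equivalent, cross_compose, kernelize):
    # Stage 1: discard duplicate inputs (keeps the first occurrence).
    xs = list(dict.fromkeys(sat_instances))
    # Stage 2: map every SAT instance to an equivalent L-instance.
    ys = [reduce_to_L(x) for x in xs]
    # Stage 3: partition the L-instances into equivalence classes.
    classes = []
    for y in ys:
        for cls in classes:
            if equivalent(cls[0], y):
                cls.append(y)
                break
        else:
            classes.append([y])
    # Stages 4-6: compose each class, kernelize, and unparameterize.
    components = []
    for cls in classes:
        z, k = cross_compose(cls)
        z_small, k_small = kernelize(z, k)
        components.append(z_small + "#" + "1" * k_small)
    # Stage 7: the output tuple is an instance of the OR-language.
    return tuple(components)
```

With real ingredients, the number of components and the length of each component are polynomial in the size of the largest input, which is what makes the output a weak distillation.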
► Corollary 10. If some set $L$ is NP-hard under Karp reductions and $L$ cross-composes into
the parameterized problem $Q$ then there is no polynomial kernel for $Q$ unless NP $\subseteq$ coNP/poly.

A simple extension of Theorem 9 shows that cross-compositions also exclude the possibility
of compression into a small instance of a different parameterized problem, a notion sometimes
referred to as bikernelization [20, 21]. If an NP-hard set cross-composes into a parameterized
problem $Q$, then unless NP $\subseteq$ coNP/poly there is no polynomial-time algorithm that maps
an instance $(x, k)$ of $Q$ to an equivalent instance $(x', k')$ of any parameterized problem $P$
with $|x'|, k' \le k^{O(1)}$.



4 Results Based on Cross-Composition 

In this section we apply the cross-composition technique to give kernelization lower bounds. 
We consider the problems Feedback Vertex Set, Chromatic Number and Clique 
under various parameterizations. The first parameter we consider is the vertex cover number 
of a graph $G$, i.e. the cardinality of a smallest set of vertices $Z \subseteq V(G)$ such that all edges
of $G$ have at least one endpoint in $Z$. We show that Clique, Chromatic Number and
Weighted Feedback Vertex Set do not admit polynomial kernels parameterized by the
size of a vertex cover unless NP $\subseteq$ coNP/poly.

We could also define the vertex cover number as the minimum number of vertex deletions 
needed to reduce a graph to an edgeless graph; hence the vertex cover number measures 
how far a graph is from being edgeless. Following the initiative of Cai [5] we may similarly 
define the deletion distance of a graph G to a (co-)cluster graph as the minimum number 
of vertices that have to be deleted from $G$ to turn it into a (co-)cluster graph. Since
(co-)cluster graphs have a very restricted structure, one would expect that a parameterization
by (co-)cluster deletion distance leads to fixed-parameter tractability; indeed this is the
case for many problems, since graphs of bounded (co-)cluster deletion distance also have
bounded cliquewidth (Lemma 19). For the Feedback Vertex Set problem, which admits
a polynomial kernel parameterized by the target size and hence by the vertex cover number,
we show that the parameterizations by cluster deletion or co-cluster deletion distance do not
admit polynomial kernels.

In Table 2 we give the known results for our subject problems with respect to the standard
parameterization, which refers to the solution size. Since the problems we study are very 
well-known, we do not give a full definition for each one. Instead we give an educative 
example of how the parameter is reflected in an instance. 

Chromatic Number parameterized by the size of a vertex cover 
Instance: A graph $G$, a vertex cover $Z \subseteq V(G)$, and a positive integer $\ell$.
Parameter: The size $k := |Z|$ of the vertex cover.
Question: Is $\chi(G) \le \ell$, i.e., can $G$ be colored with at most $\ell$ colors?

For technical reasons we supply a vertex cover in the input of the problem, to ensure that 
well-formed instances can be recognized in polynomial time. The parameter to the problem
claims a bound on the vertex cover number of the graph, and using the set Z we may verify 
this bound. For Feedback Vertex Set parameterized by deletion distance to cluster 
graphs or co-cluster graphs, we also supply the deletion set in the input. These versions of 
the problem are certainly no harder to kernelize than the versions where a deletion set or 
vertex cover is not given. 
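
Supplying the set $Z$ makes the well-formedness check a simple scan over the edges. The following sketch illustrates this; the function names and the representation of a graph as a vertex collection plus a list of edge pairs are assumptions made for illustration.

```python
# Hypothetical check that an instance (G, Z, k) is well-formed:
# Z must be a set of k vertices of G that touches every edge.

def is_vertex_cover(edges, cover):
    cover = set(cover)
    return all(u in cover or v in cover for u, v in edges)


def is_well_formed(vertices, edges, cover, k):
    vertices = set(vertices)
    cover = set(cover)
    return (
        cover <= vertices
        and all(u in vertices and v in vertices and u != v for u, v in edges)
        and len(cover) == k
        and is_vertex_cover(edges, cover)
    )
```

Both functions run in time linear in the size of the instance, whereas deciding whether a graph has a vertex cover of size $k$ without a witness is NP-complete.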
Problem name          Parameter             Param. complexity          Kernel size
Clique                clique                W[1]-hard [H]              n/a
Feedback Vertex Set   feedback vertex set   FPT [TO]                   $4k^2$ vertices
Chromatic Number      chromatic number      NP-hard for $k \in O(1)$

Table 2 Parameterized complexity and kernel size for some of the problems considered in this
paper, with respect to the standard parameterization (i.e., target size).



4.1 Clique parameterized by Vertex Cover 

An instance of the NP-complete Clique problem [THJ GT19] is a tuple $(G, \ell)$ and asks
whether the graph $G$ contains a clique on $\ell$ vertices. We use this problem for our first
kernelization lower bound. 

► Theorem 11. Clique parameterized by the size of a vertex cover does not admit a
polynomial kernel unless NP $\subseteq$ coNP/poly.

Proof. We prove the theorem by showing that Clique cross-composes into Clique parameterized
by vertex cover; by Corollary 10 this is sufficient to establish the claim. We define
a polynomial equivalence relation $\mathcal{R}$ such that all bitstrings which do not encode a valid
instance of Clique are equivalent, and two well-formed instances $(G_1, \ell_1)$ and $(G_2, \ell_2)$ are
equivalent if and only if they satisfy $|V(G_1)| = |V(G_2)|$ and $\ell_1 = \ell_2$. From this definition
it follows that any set of well-formed instances on at most $n$ vertices each is partitioned
into $O(n^2)$ equivalence classes. Since all malformed instances are in one class, this proves
that $\mathcal{R}$ is indeed a polynomial equivalence relation.
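
One way to realize this relation is to map every input to a class key and call two inputs equivalent when their keys coincide. The sketch below assumes, purely for illustration, that an input has already been decoded into a triple `(n, edges, ell)` with vertices numbered 1 to n; the paper works with bitstrings, and the decoding step is not shown.

```python
# Hypothetical class key for the equivalence relation on Clique instances.
# Anything that does not look like (n, edges, ell) is sent to one shared
# "malformed" class; well-formed instances are keyed by (n, ell).

def clique_class_key(instance):
    try:
        n, edges, ell = instance
        ok = (
            isinstance(n, int) and isinstance(ell, int)
            and n >= 0 and ell >= 0
            and all(len(e) == 2 and all(1 <= v <= n for v in e) for e in edges)
        )
    except (TypeError, ValueError):
        ok = False
    return (n, ell) if ok else "malformed"


def equivalent(a, b):
    return clique_class_key(a) == clique_class_key(b)
```

Instances on at most $n$ vertices with a non-trivial target size produce keys $(n', \ell)$ with $n', \ell \le n$, which is where the $O(n^2)$ bound on the number of classes comes from.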

We now give a cross-composition algorithm which composes $t$ input instances $x_1, \ldots, x_t$
which are equivalent under $\mathcal{R}$ into a single instance of Clique parameterized by vertex
cover. If the input instances are malformed or the size of the clique that is asked for
exceeds the number of vertices in the graph, then we may output a single constant-size NO
instance; hence in the remainder we may assume that all inputs are well-formed and encode
structures $(G_1, \ell), \ldots, (G_t, \ell)$ such that $|V(G_i)| = n$ for all $i \in [t]$ and all instances agree on
the value of $\ell$, which is at most $n$. We construct a single instance $(G', Z', \ell', k')$ of Clique
parameterized by vertex cover, which consists of a graph $G'$ with vertex cover $Z' \subseteq V(G')$ of
size $k'$ and an integer $\ell'$.

Let the vertices in each $G_i$ be numbered arbitrarily from 1 to $n$. We construct the
graph $G'$ as follows (see also Figure 1):

1. Create $\ell n$ vertices $v_{i,j}$ with $i \in [\ell]$ and $j \in [n]$. Connect two vertices $v_{i,j}$ and $v_{i',j'}$ if $i \ne i'$
and $j \ne j'$. Let $C$ denote the set of these vertices. It is crucial that any clique in $G'$ can
only contain one vertex $v_{i,\cdot}$ or $v_{\cdot,j}$ for each choice of $i \in [\ell]$ respectively $j \in [n]$. Thus any
clique contains at most $\ell$ vertices from $C$.

2. For each pair $1 \le p < q \le n$ of distinct vertices from $[n]$ (i.e., vertices of graphs $G_i$),
create three vertices: $w_{p,q}$, $w_{p,\hat{q}}$, and $w_{\hat{p},q}$ and make them adjacent to $C$ as follows:

a. $w_{p,q}$ is adjacent to all vertices from $C$,

b. $w_{p,\hat{q}}$ is adjacent to all vertices from $C$ except for $v_{\cdot,j}$ with $j = q$, and

c. $w_{\hat{p},q}$ is adjacent to all vertices from $C$ except for $v_{\cdot,j}$ with $j = p$.

Furthermore we add all edges between vertices $w_{\cdot,\cdot}$ that correspond to distinct pairs
from $[n]$. Let $D$ denote these $3\binom{n}{2}$ vertices. Any clique can contain at most one $w_{\cdot,\cdot}$-vertex
for each pair from $[n]$.
Figure 1 A sketch of the construction used in the proof of Theorem 11. The dashed edges show in
an exemplary way how vertices $w_{p,q}$, $w_{p,\hat{q}}$, and $w_{\hat{p},q}$ are connected to vertices of $B$ and $C$, e.g., $\{p, q\}$
is an edge of $G_i$ but not of $G_j$.



3. For each instance $x_i$ with graph $G_i$ make a new vertex $u_i$ and connect it to all vertices
in $C$. The adjacency to $D$ is as follows:

a. Make $u_i$ adjacent to $w_{p,q}$ if $\{p, q\}$ is an edge in $G_i$.

b. Otherwise make $u_i$ adjacent to $w_{p,\hat{q}}$ and $w_{\hat{p},q}$.

Let $B$ denote this set of $t$ vertices.

We define $\ell' := \ell + 1 + \binom{n}{2}$. Furthermore, we let $Z' := C \cup D$ which is easily verified to
be a vertex cover for $G'$ of size $k' := |Z'| = \ell n + 3\binom{n}{2}$. The value $k'$ is the parameter to
the problem, which is polynomial in $n$ and hence in the size of the largest input instance.
The cross-composition outputs the instance $x' := (G', Z', \ell', k')$. It is easy to see that our
construction of $G'$ can be performed in polynomial time. Let us now argue that $x'$ is YES if and
only if at least one of the instances $x_i$ is YES.
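
The three construction steps translate directly into code. The sketch below is an illustration under stated assumptions: each input graph is given as a set of `frozenset({p, q})` edges on vertices 1 to n, vertices of $G'$ are labelled by tuples, and the tags `"both"`, `"no_q"`, `"no_p"` are hypothetical names for $w_{p,q}$, $w_{p,\hat{q}}$, $w_{\hat{p},q}$. The brute-force `has_clique` helper is only meant for checking tiny examples.

```python
from itertools import combinations
from math import comb


def compose_clique(n, ell, edge_sets):
    """Return (adjacency dict of G', vertex cover Z', ell', k')."""
    adj = {}

    def add_edge(a, b):
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    # Step 1: grid vertices v_{i,j}; adjacent iff both indices differ.
    C = [("v", i, j) for i in range(1, ell + 1) for j in range(1, n + 1)]
    for a, b in combinations(C, 2):
        if a[1] != b[1] and a[2] != b[2]:
            add_edge(a, b)
    # Step 2: three w-vertices per pair p < q.
    D = []
    for p, q in combinations(range(1, n + 1), 2):
        both, no_q, no_p = ("w", p, q, "both"), ("w", p, q, "no_q"), ("w", p, q, "no_p")
        D += [both, no_q, no_p]
        for c in C:
            add_edge(both, c)
            if c[2] != q:
                add_edge(no_q, c)
            if c[2] != p:
                add_edge(no_p, c)
    for a, b in combinations(D, 2):
        if a[1:3] != b[1:3]:  # only w-vertices of distinct pairs are adjacent
            add_edge(a, b)
    # Step 3: one vertex u_i per input instance.
    for i, edges in enumerate(edge_sets, start=1):
        u = ("u", i)
        for c in C:
            add_edge(u, c)
        for p, q in combinations(range(1, n + 1), 2):
            if frozenset((p, q)) in edges:
                add_edge(u, ("w", p, q, "both"))
            else:
                add_edge(u, ("w", p, q, "no_q"))
                add_edge(u, ("w", p, q, "no_p"))
    Z = C + D
    return adj, Z, ell + 1 + comb(n, 2), len(Z)


def has_clique(adj, size):
    """Exponential-time check, usable only on very small graphs."""
    return any(
        all(b in adj[a] for a, b in combinations(S, 2))
        for S in combinations(list(adj), size)
    )
```

For example, composing a path and a triangle on three vertices with $\ell = 3$ yields a 20-vertex graph with $\ell' = 7$ and $k' = 18$.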

($\Leftarrow$) First we will assume that some $x_{i^*}$ is YES, i.e., that $G_{i^*}$ contains a clique on at
least $\ell$ vertices. Let $S \subseteq [n]$ denote a clique of size exactly $\ell$ in $G_{i^*}$. We will construct a
set $S'$ of size $\ell' = \ell + 1 + \binom{n}{2}$ and show that it is a clique in $G'$:

1. We add the vertex $u_{i^*}$ to $S'$.

2. Let $S = \{p_1, \ldots, p_\ell\} \subseteq [n]$. For each $p_j$ in $S$ we add the vertex $v_{j,p_j}$ to $S'$. By Step 1 all
these vertices are pairwise adjacent, and by Step 3 they are adjacent to $u_{i^*}$.

3. For each pair $1 \le p < q \le n$ there are two cases:

a. If $\{p, q\}$ is an edge of $G_{i^*}$ then the vertex $u_{i^*}$ is adjacent to $w_{p,q}$ in $G'$ (by Step 3)
and $w_{p,q}$ is adjacent to all vertices of $C$ (by Step 2). We add $w_{p,q}$ to $S'$.

b. Otherwise the vertex $u_{i^*}$ is adjacent to both $w_{p,\hat{q}}$ and $w_{\hat{p},q}$. Since the clique $S$ cannot
contain both $p$ and $q$ when $\{p, q\}$ is a non-edge we are able to add $w_{p,\hat{q}}$ respectively $w_{\hat{p},q}$
to $S'$; recall that, e.g., $w_{p,\hat{q}}$ is adjacent to all vertices of $C$ except those corresponding
to $q$.

In both cases we add one $w_{\cdot,\cdot}$-vertex to $S'$, each corresponding to a different pair $p, q$; all
these vertices are pairwise adjacent by Step 2.
We have identified the clique $S'$ in $G'$ of size $\ell' = \ell + 1 + \binom{n}{2}$, proving that $x'$ is a YES-instance.

($\Rightarrow$) Now assume that $x'$ is a YES-instance and let $S'$ be a clique of size $\ell + 1 + \binom{n}{2}$
in $G'$. Since $S'$ contains at most $\ell$ vertices from $C$ (i.e., one $v_{i,\cdot}$ for each $i \in [\ell]$) and at
most $\binom{n}{2}$ vertices from $D$ it must contain at least one vertex from $B$, say $u_{i^*} \in B$. Since $B$
is an independent set the set $S'$ must contain exactly $\ell$ vertices from $C$ and exactly $\binom{n}{2}$
vertices from $D$. Let $S = \{j \in [n] \mid v_{i,j} \in S' \text{ for some } i \in [\ell]\}$. The set $S$ has size $\ell$ since $S'$
contains at most one vertex $v_{\cdot,j}$ for each $j \in [n]$. We will now argue that $S$ is a clique in $G_{i^*}$.
Let $p, q \in S$. The clique $S'$ must contain a $w_{\cdot,\cdot}$-vertex corresponding to $\{p, q\}$ and it must
contain vertices $v_{i,p}$ and $v_{i',q}$ for some $i, i' \in [\ell]$. Therefore it must contain $w_{p,q}$ since $w_{p,\hat{q}}$
has no edges to vertices $v_{\cdot,q}$ and $w_{\hat{p},q}$ has no edges to $v_{\cdot,p}$ by Step 2. Thus $u_{i^*} \in S'$ must be
adjacent to $w_{p,q}$ which implies that $G_{i^*}$ contains the edge $\{p, q\}$. Thus $S$ is a clique in $G_{i^*}$.

Since we proved that the instance $(G', Z', \ell', k')$ can be constructed in polynomial time
and that it acts as the OR of the input instances, and because the parameter value k' 
is bounded by a polynomial in the size of the largest input instance, this concludes the 
cross-composition proof and establishes the claim. 

► Corollary 12. If $\mathcal{F}$ is a class of graphs containing all cliques, then Vertex Cover and
Independent Set parameterized by the minimum number of vertex deletions to obtain a
graph in $\mathcal{F}$ do not admit polynomial kernels unless NP $\subseteq$ coNP/poly. In particular, Vertex
Cover and Independent Set parameterized by co-cluster deletion distance or cluster
deletion distance do not admit polynomial kernels unless NP $\subseteq$ coNP/poly.

4.2 Chromatic Number parameterized by Vertex Cover 

In this section we give a kernelization lower bound for Chromatic Number parameterized 
by vertex cover, through the use of a restricted version of 3-Coloring. 

► Definition 13. A graph G is a triangle split graph if V(G) can be partitioned into sets X, Y 
such that G[X] is an edgeless graph and G[Y] is a disjoint union of vertex-disjoint triangles. 

An instance of the classical problem 3-Coloring with Triangle Split Decomposition
is a tuple $(G, X, Y)$ consisting of a graph $G$ and a partition of its vertex set into $X \cup Y$
such that $G[X]$ is edgeless and $G[Y]$ is a union of vertex-disjoint triangles. The question is
whether G has a proper 3-coloring. The following lemma shows that this restricted form of 
the problem is NP-complete, which is proven by replacing all edges in a normal instance of 
3-Coloring with a triangle. The proof is deferred to the appendix due to space restrictions. 

► Lemma 14. 3-Coloring with Triangle Split Decomposition is NP-complete.

► Theorem 15. Chromatic Number parameterized by the size of a vertex cover does not
admit a polynomial kernel unless NP $\subseteq$ coNP/poly.

Proof. To prove the theorem we will show that 3-Coloring with Triangle Split Decomposition
cross-composes into Chromatic Number parameterized by a vertex cover
of the graph. By a suitable choice of polynomial equivalence relation in the same style
as in Theorem 11 we may assume that we are given $t$ input instances which encode
structures $(G_1, X_1, Y_1), \ldots, (G_t, X_t, Y_t)$ of 3-Coloring with Triangle Split Decomposition
with $|X_i| = n$ and $|Y_i| = 3m$ for all $i \in [t]$ (i.e., $m$ is the number of triangles in each instance).
We will compose these instances into one instance $(G', Z', \ell', k')$ of Chromatic Number
parameterized by vertex cover. By duplicating some instances we may assume that the
number of inputs $t$ is a power of 2; this only increases the input size by a factor of at most 2,
and hence any bounds which are polynomial in the old input size will be polynomial in the
new input size which is sufficient for our purposes.

For each set $Y_i$, label the triangles in $G_i[Y_i]$ as $T_1, \ldots, T_m$ in some arbitrary way, and label
the vertices in each triangle $T_j$ for a set $Y_i$ as $a^i_j, b^i_j, c^i_j$. We build a graph $G'$ with a vertex
cover of size $k' := 3\log t + 4 + 3m \in O(m + \log t)$ such that $G'$ can be $\ell' := (\log t + 4)$-colored
if and only if one of the input instances can be 3-colored.

1. Create a clique on vertices $\{p_i \mid i \in [\log t]\} \cup \{w, x, y, z\}$; it is called the palette.

2. Add the vertices $\bigcup_{i=1}^{t} X_i$ to the graph, and make them adjacent to the vertex $w$.

3. For $i \in [m]$ add a triangle $T^*_i$ to the graph on vertices $\{a_i, b_i, c_i\}$. The union of these
triangles will be the triangle vertices $T^*$. Make all vertices in $T^*$ adjacent to all vertices
from the set $\{p_i \mid i \in [\log t]\} \cup \{w\}$.

4. For $i \in [\log t]$ add a path on two new vertices $\{q^i_0, q^i_1\}$ to the graph, and make them
adjacent to all vertices $(\{p_j \mid j \in [\log t]\} \cup \{x, y, z\}) \setminus \{p_i\}$. These vertices form the
instance selector vertices.

5. For each instance number $i \in [t]$ consider the binary representation of the value $i$,
which can be expressed in $\log t$ bits. Consider each position $j \in [\log t]$ of this binary
representation, where position 1 is most significant and $\log t$ is least significant. If bit
number $j$ of the representation of $i$ is a 0 (resp. a 1) then make vertex $q^j_0$ (resp. $q^j_1$)
adjacent to all vertices of $X_i$. (We identify $t$ by the all-zero string $0 \ldots 0$.)

6. As the final step we re-encode the adjacencies between vertices in the independent sets $X_i$
and the triangles into our graph $G'$. For each $i \in [t]$, for each vertex $v \in X_i$, do the
following. If $v$ is adjacent in $G_i$ to vertex $a^i_j$ then make vertex $v$ adjacent in $G'$ to $a_j$. Do
the same for adjacencies of $v$ to $b^i_j$ and $c^i_j$.

This concludes the construction. The following claims about G' are easy to verify: 

(I) In every proper $\ell'$-coloring of $G'$, where $\ell' = \log t + 4$, the following must hold:

a. each of the $\log t + 4$ vertices of the palette clique receives a unique color,

b. consider some $i \in [\log t]$: the vertices $q^i_0$ and $q^i_1$ receive different colors (since they
are adjacent), one of them must take the color of $w$ and the other of $p_i$ (they are
adjacent to all other vertices of the palette),

c. the triangle vertices $T^*$ are colored using the colors of $x, y, z$ (they are adjacent to
all other vertices of the palette),

d. the only colors which can occur on a vertex in $X_i$ (for all $i \in [t]$) are the colors
given to $x, y, z$ and $\{p_j \mid j \in [\log t]\}$ (since the vertices in $X_i$ are adjacent to $w$).

(II) For every $i \in [t]$, the graph $G'[X_i \cup T^*]$ is isomorphic to $G_i$.

(III) The set $Z' := \{p_i \mid i \in [\log t]\} \cup \{w, x, y, z\} \cup T^* \cup \{q^i_0, q^i_1 \mid i \in [\log t]\}$ forms a vertex
cover of $G'$ of size $k' = |Z'| = 3\log t + 4 + 3m$. Hence we establish that $G'$ has a vertex
cover of size $O(m + \log t)$.

Due to space restrictions we cannot give the full correctness proof for the transformation.
Using the given properties of $G'$ one may verify that $\chi(G') \le \log t + 4 \Leftrightarrow \exists i \in [t] : \chi(G_i) \le 3$.
The full proof is in the appendix as Lemma 22. ◀
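The six construction steps can be sketched in code as follows. This is only an illustrative sketch, not part of the original proof: the function name `compose_chromatic`, the vertex naming scheme, and the encoding of an input instance as a pair `(num_x, edges)` are assumptions made for the example, and the sketch assumes $t \ge 2$.

```python
from itertools import combinations


def compose_chromatic(instances, m):
    """Build G' from t = 2^r (r >= 1) instances that all have m triangles.

    Each instance is a pair (num_x, edges); an entry (v, (kind, j)) of `edges`
    says that vertex v of X_i is adjacent to vertex kind_j of triangle j,
    with kind in "abc" and 1 <= j <= m. This encoding is an assumption.
    """
    t = len(instances)
    r = t.bit_length() - 1                      # r = log t
    assert t >= 2 and t == 1 << r, "t is assumed to be a power of two"
    E = set()

    def link(u, v):
        E.add(frozenset((u, v)))

    P = [("p", j) for j in range(1, r + 1)]
    palette = P + ["w", "x", "y", "z"]
    for u, v in combinations(palette, 2):       # step 1: palette clique
        link(u, v)
    T = [(kind, j) for j in range(1, m + 1) for kind in "abc"]
    for j in range(1, m + 1):                   # step 3: the triangles T*
        for u, v in combinations([(kind, j) for kind in "abc"], 2):
            link(u, v)
    for tv in T:
        for u in P + ["w"]:
            link(tv, u)
    Q = []
    for j in range(1, r + 1):                   # step 4: instance selectors
        q0, q1 = ("q", j, 0), ("q", j, 1)
        Q += [q0, q1]
        link(q0, q1)
        for u in [p for p in P if p != ("p", j)] + ["x", "y", "z"]:
            link(q0, u)
            link(q1, u)
    for i, (num_x, edges) in enumerate(instances, start=1):
        bits = format(i % t, "0{}b".format(r))  # instance t is the all-zero string
        for v in range(num_x):
            link(("X", i, v), "w")              # step 2
            for j in range(1, r + 1):           # step 5: position 1 is most significant
                link(("X", i, v), ("q", j, int(bits[j - 1])))
        for v, tv in edges:                     # step 6: re-encode X_i-to-triangle edges
            link(("X", i, v), tv)
    Z = set(palette) | set(T) | set(Q)          # the vertex cover of claim (III)
    return E, Z, r + 4                          # edges, Z', and ell' = log t + 4
```

On small inputs one can check claim (III) directly: every edge returned has an endpoint in `Z`, and `len(Z)` equals $3\log t + 4 + 3m$.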

For every fixed integer $q$, the $q$-Coloring problem parameterized by the vertex cover number
does admit a polynomial kernel. Kernelization algorithms for structural parameterizations
of the $q$-Coloring problem will be the topic of a future publication.

4.3 Kernelization lower bounds for Feedback Vertex Set

In this section we give several kernelization lower bounds for Feedback Vertex Set. Due 
to space constraints the proofs are deferred to the appendix. 

► Theorem 16. Feedback Vertex Set parameterized by deletion distance to co-cluster
graphs does not admit a polynomial kernel unless NP $\subseteq$ coNP/poly. ◀

► Theorem 17. Feedback Vertex Set parameterized by deletion distance to cluster
graphs does not admit a polynomial kernel unless NP $\subseteq$ coNP/poly. ◀

► Theorem 18. Weighted Feedback Vertex Set, where each vertex is given a positive
integer as its weight, does not admit a polynomial kernel parameterized by the size of a vertex
cover unless NP $\subseteq$ coNP/poly. ◀

5 Conclusions 

We have introduced the technique of cross-composition and used it to derive kernelization 
lower bounds for structural parameterizations of several graph problems. Since we expect 
that cross-composition will be a fruitful tool in the further study of kernelization lower 
bounds, we give some pointers on how to devise cross-composition constructions. As the 
source problem of the composition one may choose a restricted yet NP-hard version of the 
target problem; this brings down the richness of the instances that need to be composed. If 
the goal is to give a lower bound for a structural parameterization (such as the size of a vertex 
cover) then starting from a problem on graphs which decompose into an independent set and 
some very structured remainder (e.g. triangle split graphs decompose into an independent 
set and vertex-disjoint triangles) it may be possible to compose the instances by taking 
the disjoint union of the inputs, and one-by-one identifying the vertices in the structured 
remainder. The fact that cross-compositions allow the output parameter to be polynomial 
in the size of the largest input can also be exploited, e.g., the proof of Theorem 11 uses
this when composing input instances on $n$ vertices into a graph $G'$: we create vertices
inside a vertex cover Z' for G' , and the adjacencies between Z' and a single vertex outside 
the cover represent the entire adjacency structure of an input graph. 

Cross-composition is also appealing from a methodological point of view, since it gives a 
unified way of interpreting the two earlier techniques for proving kernelization lower bounds: 
OR-compositions and polynomial-parameter transformations can both be seen to yield cross- 
compositions for a problem. For OR-composition this is trivial to see since an OR-composition 
for problem Q just shows that the unparameterized variant Q cross-composes into Q. The 
combination of an OR-composition for problem P and a polynomial-parameter transform 
from P to Q also gives a cross-composition: first applying the OR-composition on instances 
of P and then transforming the resulting P-instance to a Q-instance effectively shows that
we can cross-compose instances of the unparameterized variant P into instances of Q. Hence 
the cross-composition technique puts the existing methods of showing super-polynomial 
kernelization lower bounds in a common framework, and also explains why these problems 
do not admit polynomial kernels: a parameterized problem P does not admit a polynomial 
kernel if it can encode the OR of some NP-hard problem for a sufficiently small parameter 
value. This new perspective might lead to a deeper insight into the common structure of 
FPT problems without polynomial kernels.

A Parameterized complexity of cluster and co-cluster deletion parameters

In this section we briefly show that Feedback Vertex Set is in FPT parameterized by
cluster deletion or co-cluster deletion distance, through an argument about cliquewidth [22].
The following proposition about cliquewidth is folklore.

► Proposition 1. A cluster graph has cliquewidth 1.

We also use two results from Allen, Lozin and Rao [1].

► Proposition 2. For any graph $G$ it holds that $\textsc{cliquewidth}(\overline{G}) \le 2 \cdot \textsc{cliquewidth}(G)$,
where $\overline{G}$ is the edge-complement of $G$.

► Proposition 3. If a graph $G$ is obtained from a graph $H$ by deleting $k$ vertices, then
$\textsc{cliquewidth}(G) \le \textsc{cliquewidth}(H) \le 2^k (\textsc{cliquewidth}(G) + 1)$.

These propositions allow us to relate the parameter "deletion distance to a (co-)cluster 
graph" to the cliquewidth of a graph. 

► Lemma 19. If graph $H$ can be turned into a cluster graph or co-cluster graph by $k$ vertex
deletions, then $\textsc{cliquewidth}(H) \le 3 \cdot 2^k$.

Proof. By Proposition 1 and Proposition 2 the cliquewidth of cluster graphs is 1, and the
cliquewidth of co-cluster graphs is at most two. Assume $H$ can be turned into a cluster graph
or co-cluster graph $G$ by exactly $k$ vertex deletions. Then $\textsc{cliquewidth}(G) \le 2$ and from
Proposition 3 it follows that $\textsc{cliquewidth}(H) \le 2^k (\textsc{cliquewidth}(G) + 1) \le 2^k \cdot 3$. ◀

Lemma 19 shows that graphs of bounded cluster graph deletion number or co-cluster
graph deletion number also have bounded cliquewidth. Since the Feedback Vertex Set
problem can be solved in FPT-time on graphs of bounded cliquewidth [7] , this shows that 
Feedback Vertex Set is in FPT when parameterized by cluster deletion distance or 
co-cluster deletion distance. 

B Omitted proofs 
B.1 Proof of Corollary 12

Proof of Corollary 12. Consider an instance $(G, Z, \ell, k)$ of Clique parameterized by the
size of a vertex cover. Since a clique in $G$ is an independent set in the complement $\overline{G}$, the Clique instance
is equivalent to asking whether the graph $\overline{G}$ has an independent set of size at least $\ell$.
Because $Z$ is a vertex cover for $G$ we know that $G - Z$ is an independent set, and therefore
$\overline{G} - Z$ is a clique. Hence if we use a parameter "deletion distance from a complete
graph" which measures how many vertex deletions are needed to obtain a complete graph,
then the instance $(G, Z, \ell, k)$ of Clique parameterized by vertex cover is equivalent to an
instance $(\overline{G}, Z, \ell, k)$ of Independent Set parameterized by the size of the set $Z$ whose
deletion from $\overline{G}$ leaves a complete graph. Since $\overline{G}$ has an independent set of size $\ell$ if and only
if it has a vertex cover of size $|V(G)| - \ell$ it follows that these two instances are also equivalent
to the instance $(\overline{G}, Z, |V(G)| - \ell, k)$ of Vertex Cover parameterized by a deletion set $Z$ to
a complete graph.

Since the proof of Theorem 11 shows that instances of Clique cross-compose into an
instance $(G, Z, \ell, k)$ of Clique parameterized by vertex cover, and since this instance is
equivalent to instance $(\overline{G}, Z, \ell, k)$ of Independent Set parameterized by deletion distance
to complete graphs and instance $(\overline{G}, Z, |V(G)| - \ell, k)$ of Vertex Cover parameterized
by deletion distance to complete graphs, this proves that Clique cross-composes into the
latter two parameterized problems and hence they do not admit polynomial kernels unless
NP $\subseteq$ coNP/poly.

Let $\mathcal{F}$ be a class of graphs containing all complete graphs. Then the minimum number of
vertex deletions needed to transform a graph $G$ into a graph in $\mathcal{F}$ is at most the number of
vertex deletions needed to turn $G$ into a complete graph. Hence the parameter "deletion
distance to a graph in $\mathcal{F}$" is not larger than the parameter "deletion distance to complete
graphs", and therefore Independent Set and Vertex Cover do not have a polynomial
kernel for the parameter deletion distance to $\mathcal{F}$. Since the classes of cluster graphs and
co-cluster graphs contain all cliques, this proves all claims in the corollary. ◀
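The chain of instance equivalences used in the first part of this proof (a clique in $G$, an independent set in the complement, a vertex cover of complementary size in the complement) can be checked by brute force on small graphs. The sketch below is illustrative only; the helper names are assumptions and the exhaustive search is exponential, so it is meant for toy inputs.

```python
from itertools import combinations


def complement(vertices, edges):
    """Edge set of the complement graph on the same vertex set."""
    present = {frozenset(e) for e in edges}
    return {frozenset(p) for p in combinations(vertices, 2)} - present


def has_clique(vertices, edges, size):
    present = {frozenset(e) for e in edges}
    return any(all(frozenset(p) in present for p in combinations(S, 2))
               for S in combinations(vertices, size))


def has_independent_set(vertices, edges, size):
    present = {frozenset(e) for e in edges}
    return any(all(frozenset(p) not in present for p in combinations(S, 2))
               for S in combinations(vertices, size))


def has_vertex_cover(vertices, edges, size):
    # A set C is a vertex cover if every edge has at least one endpoint in C.
    return any(all(frozenset(e) & set(C) for e in edges)
               for C in combinations(vertices, size))
```

For a graph on vertex list `V` with edge list `E`, the three predicates `has_clique(V, E, l)`, `has_independent_set(V, complement(V, E), l)` and `has_vertex_cover(V, complement(V, E), len(V) - l)` should agree for every `l`.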

B.2 Proofs for Chromatic Number parameterized by Vertex Cover 

An odd cycle is a simple cycle on an odd number of at least 3 vertices. An odd wheel is the
graph which is obtained from an odd cycle by adding a new vertex which is adjacent to all
other vertices. The vertices on the odd cycle become the rim of the wheel, whereas the new
universal vertex is the hub of the wheel. The following proposition about coloring odd wheels
can be found in any standard textbook on graph theory.

► Proposition 4. An odd wheel is not 3-colorable. 

► Lemma 20. Let $G$ be a graph and let $u$ and $v$ be distinct non-adjacent vertices in $G$ such
that $G[N_G(\{u, v\})]$ contains an odd cycle. Then $u$ and $v$ must receive different colors in a
proper 3-coloring of $G$.

Proof. Proof by contradiction. Assume there is a proper 3-coloring of $G$ where $u$ and $v$
receive the same color. The coloring is still proper if we identify the vertices $u$ and $v$ into a
single vertex $z$ which takes the same color as $u$ and $v$ (discarding parallel edges that might
arise). After the transformation this new vertex $z$ is adjacent to all vertices in $N_G(\{u, v\})$.
Since we assumed $G[N_G(\{u, v\})]$ contains an odd cycle, all vertices of this odd cycle are
adjacent to $z$ after merging $u$ and $v$. But this shows that in the transformed graph $z$ forms
the hub of an odd wheel with the vertices on the odd cycle as the rim. By Proposition 4 a
graph containing an odd wheel cannot be 3-colored, which is a contradiction to the 3-coloring
we extracted from the assumption that $G$ is 3-colored with the same color for $u$ and $v$; this
proves the claim. ◀
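The statement of Lemma 20 can be illustrated by exhaustive enumeration on a small example. The graph below is a hypothetical instance chosen for the illustration: $u$ and $v$ are non-adjacent and their joint neighbourhood $\{r_1, r_2, r_3\}$ induces a triangle.

```python
from itertools import product


def proper_3_colorings(vertices, edges):
    """Yield every proper 3-coloring as a dict vertex -> color in {0, 1, 2}."""
    for colors in product(range(3), repeat=len(vertices)):
        coloring = dict(zip(vertices, colors))
        if all(coloring[a] != coloring[b] for a, b in edges):
            yield coloring


# Hypothetical example: u and v are non-adjacent, and N({u, v}) = {r1, r2, r3}
# induces a triangle, which is an odd cycle.
VERTICES = ["u", "v", "r1", "r2", "r3"]
EDGES = [("r1", "r2"), ("r2", "r3"), ("r1", "r3"),
         ("u", "r1"), ("v", "r2"), ("v", "r3")]
COLORINGS = list(proper_3_colorings(VERTICES, EDGES))
```

Every coloring in `COLORINGS` assigns different colors to `u` and `v`, as the lemma predicts.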

► Lemma 21. 3-Coloring with Triangle Split Decomposition is NP-complete. 

Proof. It is well-known that 3-COLORING on general graphs is NP-complete [HI GT4], 
and it is trivial to see that the problem restricted to triangle split graphs is contained in 
NP. We show how to transform an instance G of 3-coloring in polynomial time into an 
equivalent instance of 3-coloring on a graph G' with a triangle split decomposition of V(G') 
into sets $X', Y'$. Number the edges in $G$ as $e_1, e_2, \ldots, e_m$. Construct the graph $G'$ as follows:

- Set $V(G') := V(G) \cup \{a_i, b_i, c_i \mid i \in [m]\}$.
- Add the edges $\{a_i, b_i\}, \{b_i, c_i\}, \{a_i, c_i\}$ to $E(G')$ for $i \in [m]$.
- For each edge $e_i = \{u_i, v_i\}$ ($i \in [m]$) of graph $G$, make vertex $u_i$ adjacent in $G'$ to $a_i$, and
make $v_i$ adjacent to $b_i$ and $c_i$.
- Define $X' := V(G)$ and $Y' := \{a_i, b_i, c_i \mid i \in [m]\}$.

This concludes the description of $G'$. It is easy to see that $G'$ is a triangle split graph with
the partition $X'$ and $Y'$ since $G'[X']$ is an independent set and $G'[Y']$ is a disjoint union of
triangles. We now show that $\chi(G') \le 3$ if and only if $\chi(G) \le 3$.

($\Rightarrow$) Assume that $\chi(G') \le 3$ and consider a 3-coloring of $G'$. For every edge $\{u_i, v_i\} \in E(G)$
we added a triangle on vertices $\{a_i, b_i, c_i\}$ to the graph $G'$. Hence $G'[N_{G'}(\{u_i, v_i\})]$ contains
an odd cycle for all pairs of vertices $\{u_i, v_i\}$ which are adjacent in $G$. By Lemma 20 this
implies that $u_i$ and $v_i$ receive different colors in a 3-coloring of $G'$, and therefore the 3-coloring
of $G'$ restricted to the vertex set of $G$ is a proper 3-coloring of $G$.

($\Leftarrow$) Assume that $G$ has a proper 3-coloring. We construct a 3-coloring for $G'$ by coloring
all vertices of $V(G') \cap V(G)$ the same as in $G$; now all that remains is to color the triangles
we added to the graph. If there is a triangle $\{a_i, b_i, c_i\}$ for a pair $\{u_i, v_i\}$ then $u_i$ and $v_i$ are
adjacent in $G$ and hence they receive different colors in the proper coloring. Now give $a_i$ the
color of $v_i$, give $b_i$ the color of $u_i$ and give $c_i$ the remaining color. If we do this for every
triangle then we obtain a proper 3-coloring of $G'$ which proves that $\chi(G') \le 3$.

Since the instance $(G', X', Y')$ can be built from $G$ in polynomial time this proves that
3-Coloring with Triangle Split Decomposition is NP-complete. ◀
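The reduction in the proof of Lemma 21 is short enough to sketch in code, together with a simple backtracking test for 3-colorability that lets one compare $G$ and $G'$ on small graphs. The function names and the naming of the new triangle vertices as tuples `("a", i)`, `("b", i)`, `("c", i)` are assumptions of this sketch; the original vertices are assumed not to use such tuple names.

```python
def triangle_split_reduction(vertices, edges):
    """Return (V', E', X', Y') for the graph G' built from G = (vertices, edges)."""
    new_vertices = list(vertices)
    new_edges = []
    for i, (u, v) in enumerate(edges):
        a, b, c = ("a", i), ("b", i), ("c", i)
        new_vertices += [a, b, c]
        # Triangle a_i b_i c_i; u_i is attached to a_i, v_i to b_i and c_i.
        new_edges += [(a, b), (b, c), (a, c), (u, a), (v, b), (v, c)]
    X = list(vertices)
    Y = new_vertices[len(X):]
    return new_vertices, new_edges, X, Y


def is_3_colorable(vertices, edges):
    """Plain backtracking search for a proper 3-coloring (exponential in general)."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    order = list(vertices)
    color = {}

    def extend(k):
        if k == len(order):
            return True
        v = order[k]
        for c in range(3):
            if all(color.get(w) != c for w in adj[v]):
                color[v] = c
                if extend(k + 1):
                    return True
                del color[v]
        return False

    return extend(0)
```

For instance, applying the reduction to $K_4$ should give a graph that is not 3-colorable, while applying it to a 5-cycle should give a 3-colorable graph.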

► Lemma 22. Let $(G_1, X_1, Y_1), \ldots, (G_t, X_t, Y_t)$ be input instances of 3-Coloring with
Triangle Split Decomposition which are mapped to the instance $(G', Z', \ell')$ of Chromatic
Number parameterized by vertex cover according to the construction of Theorem 15.
Then $\chi(G') \le \ell' \Leftrightarrow \exists i \in [t] : \chi(G_i) \le 3$.

Proof. Throughout the proof we will refer to the structural claims made about the graph $G'$
in the proof of Theorem 15.

($\Rightarrow$) Suppose $\chi(G') \le \ell'$ and consider some proper $\ell'$-coloring of $G'$. By (Ib) we know
that for each $i \in [\log t]$ exactly one vertex of the pair $\{q^i_0, q^i_1\}$ receives the same color as $p_i$.
Consider the string of $\log t$ bits where the $i$-th most significant bit is a 1 if and only if
vertex $q^i_1$ receives the same color as $p_i$. This bitstring encodes some integer $i^* \in [t]$. We focus
on the instance with the number $i^*$. Let $Q$ be the set of vertices which contains for each
pair $\{q^i_0, q^i_1\}$ ($i \in [\log t]$) the unique vertex which is colored the same as $p_i$. By the definition
of $G'$ we know that all vertices of $X_{i^*}$ are adjacent to all vertices of $Q$; hence in any proper
coloring of $G'$ the vertices of $X_{i^*}$ cannot use any colors which are used on $\{p_i \mid i \in [\log t]\}$.
By (Id) this implies that the coloring for $G'$ can only use the colors of $x, y, z$ on the vertices
of $X_{i^*}$. By (Ic) the triangle vertices $T^*$ are also colored using only the colors of $x, y, z$. The
graph $G'[X_{i^*} \cup T^*]$ is isomorphic to the input graph $G_{i^*}$ by (II), and since the coloring of $G'$
only uses the colors of $x, y, z$ on these vertices, this shows that the coloring of $G'$ restricted
to the induced subgraph $G'[X_{i^*} \cup T^*]$ is in fact a 3-coloring of graph $G_{i^*}$, which proves
that $\chi(G_{i^*}) \le 3$ and establishes this direction of the equivalence.

($\Leftarrow$) Suppose $\chi(G_{i^*}) \le 3$ for some $i^* \in [t]$. We will construct a proper $\ell'$-coloring of $G'$.
Start by giving all vertices of the palette different colors. By (II) the graph $G'[X_{i^*} \cup T^*]$
is isomorphic to $G_{i^*}$. Re-label the colors in the 3-coloring of $G_{i^*}$ such that it uses the
colors given to $\{x, y, z\}$ in our partial $\ell'$-coloring of $G'$. Give a vertex $v$ in the induced
subgraph $G'[X_{i^*} \cup T^*]$ the same color as the vertex in $G_{i^*}$ to which it is mapped by the
isomorphism. Afterwards we have a proper partial $\ell'$-coloring, where all vertices of the
palette, all vertices of $X_{i^*}$, and all triangle vertices of $G'$ are colored. It remains to color
the sets $X_i$ for $i \ne i^*$, and the pairs $\{q^i_0, q^i_1\}$. For each $i \in [\log t]$ we color the pair $\{q^i_0, q^i_1\}$ as
follows: if the $i$-th most significant bit of the binary representation of the number $i^*$ is a 1
then we color $q^i_1$ the same color as $p_i$ and we color $q^i_0$ as $w$; if the bit is a 0 then we do it the
other way around. It is straightforward to verify that we do not create any monochromatic
edges in this way. As the final step we have to color the sets $X_i$ for $i \ne i^*$; so consider
some $i \in [t]$ with $i \ne i^*$. The binary representation of the number $i^*$ must differ from the
binary representation of $i$ in at least one position; suppose they differ at position $j$. The
vertex of $\{q^j_0, q^j_1\}$ which matches the bit value of $i^*$ at position $j$ was colored the same as $p_j$,
hence the other vertex of the pair must have been colored the same as $w$. Since the bit
values differ, by the definition of adjacencies in $G'$ we find that the vertices of $X_i$ are adjacent
to the vertex of $\{q^j_0, q^j_1\}$ which is colored as $w$. Therefore the vertices of $X_i$ do not have any
neighbors colored as $p_j$, and since $X_i$ is an independent set we may color all vertices in it
the same as $p_j$. If we color all sets $X_i$ for $i \ne i^*$ in this way we obtain a proper $\ell'$-coloring
of $G'$ which proves that $\chi(G') \le \ell'$. ◀

B.3 Feedback Vertex Set parameterized by Co-cluster Deletion 
Distance 

In the proof of this section we will use the Feedback Vertex Set problem restricted to
bipartite input graphs of girth at least six. An instance of the problem Feedback Vertex
Set on Bipartite Graphs of Girth $\ge 6$ (FVS-BG6) is a tuple $(G, X, Y, \ell)$ and consists
of a bipartite graph $G$ of girth at least 6 with bipartition of the vertex set into $X \cup Y$, and a
target value $\ell$, and asks whether $G$ has a feedback vertex set of size at most $\ell$.

► Observation 1. Feedback Vertex Set on Bipartite Graphs of Girth $\ge 6$ is NP-complete.
This follows from the fact that a normal instance of Feedback Vertex Set can
be reduced to an equivalent instance on a bipartite graph of girth at least six by subdividing
each edge with three new degree-2 vertices.

Proof of Theorem 16. We prove the theorem by showing that Feedback Vertex Set
on Bipartite Graphs of Girth $\ge 6$ cross-composes into Feedback Vertex Set
parameterized by deletion distance to co-cluster graphs (FVS-DCC); by Observation 1 and
Corollary 10 this is sufficient to establish the claim. We start by defining a polynomial
equivalence relation for our input instances. Using a standard encoding (such as an adjacency
list) it is easy to verify the bipartition and the bound on the girth in polynomial time.
Hence we can test in polynomial time whether an instance is well-formed. We define our
polynomial equivalence relation $\mathcal{R}$ such that all malformed instances are equivalent, and two
well-formed instances $(G_1, X_1, Y_1, \ell_1)$ and $(G_2, X_2, Y_2, \ell_2)$ are equivalent if and only if they
satisfy $|X_1| = |X_2|$, $|Y_1| = |Y_2|$ and $\ell_1 = \ell_2$. From this definition it follows that any set of
well-formed instances on at most $n$ vertices each is partitioned into $O(n^3)$ equivalence classes.
Since all malformed instances are in one class, this proves that $\mathcal{R}$ is indeed a polynomial
equivalence relation.
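The equivalence relation amounts to grouping instances by a small key. The following sketch is an illustration only: the dictionary layout of an instance and the precomputed `well_formed` flag (standing for the polynomial-time bipartition and girth check) are assumptions of the example.

```python
from collections import defaultdict


def equivalence_key(instance):
    """Class of an instance under the relation R.

    `instance` is a hypothetical dict with keys "well_formed" (result of the
    bipartition and girth check), "X", "Y" and "ell".
    """
    if not instance["well_formed"]:
        return ("malformed",)          # all malformed instances are equivalent
    return ("ok", len(instance["X"]), len(instance["Y"]), instance["ell"])


def partition(instances):
    """Group instances into equivalence classes, keyed by equivalence_key."""
    classes = defaultdict(list)
    for instance in instances:
        classes[equivalence_key(instance)].append(instance)
    return dict(classes)
```

Since each of the three numbers in the key is bounded by the instance size, the number of distinct keys is polynomial, matching the cubic bound on the number of classes stated above.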

We now give a cross-composition algorithm which composes $t$ input instances $x_1, \ldots, x_t$
which are equivalent under $\mathcal{R}$ into a single instance of FVS-DCC. If the input instances
are malformed then we may output a single constant-size NO instance of FVS-DCC;
hence in the remainder we may assume that all inputs are well-formed and encode
structures $(G_1, X_1, Y_1, \ell), \ldots, (G_t, X_t, Y_t, \ell)$ which all agree on the value of $\ell$ and for which
$|X_i| = |X_j|$ and $|Y_i| = |Y_j|$ for all $i, j \in [t]$. We now construct in polynomial time a single
instance $(G', Z', \ell', k')$ of FVS-DCC which is YES if and only if one of the input instances
is YES, and such that $k' = |Z'|$ is bounded by $|Y_1|$; since the maximum size of an input
instance is at least $|Y_1|$ this will show that the parameter size satisfies the requirements for a
cross-composition.

We construct the graph $G'$ starting from a disjoint union of the graphs $G_i$. We label the
vertices in each set $Y_i$ arbitrarily from 1 to $n$, and then identify the vertex sets $Y_1, \ldots, Y_t$
to a new vertex set $Y^*$: we identify the first vertex of each set into one new vertex, the
second vertex of each set, etc. We add all edges between vertex sets $X_i, X_j$ for all $i \ne j$. We
observe that $G'[X_1 \cup \cdots \cup X_t]$ is a co-cluster graph. Thus $G'$ has a deletion distance of at
most $|Y^*| = |Y_1|$ to co-cluster graphs. Let $\ell' := (t-1)|X_1| + \ell$, let the deletion set to co-cluster
graphs be $Z' := Y^*$ which implies that the parameter to this problem is $k' := |Z'| = |Y^*|$.
From this construction it follows that $G_i$ is isomorphic to $G'[X_i \cup Y^*]$. It remains to prove
correctness of the cross-composition: the output instance $(G', Z', \ell', k')$ is YES if and only if
one of the input instances is YES.

($\Leftarrow$) Let us first assume that some instance, say $x_i$, is YES and let $S$ be a feedback vertex
set of $G_i$ of size at most $\ell$. Let $S' := S \cup \bigcup_{j \ne i} X_j$. It is easy to see that $G' - S' = G_i - S$
and that $|S'| \le (t-1)|X_1| + \ell$. Thus the output instance is YES.

($\Rightarrow$) Let us now assume that the output instance is YES and let $S'$ be a feedback vertex
set for $G'$ of size at most $\ell' = (t-1)|X_1| + \ell$. We first observe that $S'$ must completely
contain almost all sets $X_i$. Indeed, if there are three sets $X_{i_1}, X_{i_2}, X_{i_3} \nsubseteq S'$ then $G' - S'$
contains a triangle since we added all edges between different sets $X_i, X_j$.

If there is exactly one set $X_i$ with $X_i \nsubseteq S'$ then $S'$ contains $\bigcup_{j \ne i} X_j$. Letting
$S := S' \setminus \bigcup_{j \ne i} X_j$ we observe that $G' - S' = (G' - \bigcup_{j \ne i} X_j) - S = G_i - S$. Thus $S$ is a
feedback vertex set of $G_i$, since $G_i - S = G' - S'$ is acyclic by choice of $S'$. Furthermore,
since $\bigcup_{j \ne i} X_j \subseteq S'$ we get that $|S| \le |S'| - (t-1)|X_1| \le \ell$. Thus $x_i$ is a YES-instance of
FVS-BG6.

It remains to consider the case that there are two sets $X_i, X_j$ with $X_i, X_j \nsubseteq S'$. If $S'$
misses at least two vertices in each of the two sets then $G' - S'$ would contain a cycle of length
four. Thus we assume w.l.o.g. that $|X_j \setminus S'| = 1$ and we let $u$ denote the vertex of $X_j$ that is
not in $S'$. We recall that $u$ is adjacent to all vertices of $X_i$. If $S'$ does not contain any vertex
from $X_i \cup Y^*$ then $G_i = G'[X_i \cup Y^*]$ is acyclic. In that case $x_i$ is a YES-instance of FVS-BG6
and we are done. Otherwise let $v \in S' \cap (X_i \cup Y^*)$ and let $S'' = (S' \setminus \{v\}) \cup \{u\}$. We will
show that $S''$ is a feedback vertex set of $G'$ (of size at most $\ell'$) and with $\bigcup_{j \ne i} X_j \subseteq S''$ which
permits us to reuse the argument from the previous paragraph to show that $x_i$ is YES.

We assume for contradiction that $G' - S''$ is not acyclic. Thus there must be a cycle $C$
which contains the vertex $v$. Since $\bigcup_{j \ne i} X_j \subseteq S''$ the cycle $C$ is contained in a copy of $G_i$
implying that it has length of at least six. We let $C = (\ldots, p, q, v, r, s, \ldots)$ and consider two
cases:

- $v \in Y^*$: Since $G_i$ is bipartite, the vertices $q$ and $r$ must be in $X_i$ and are adjacent
to $u \in X_j$ by construction. Thus $C = (\ldots, p, q, u, r, s, \ldots)$ would be a cycle in $G' - S'$.
A contradiction.
- $v \in X_i$: In this case $p$ and $s$ must be in $X_i$ and adjacent to $u \in X_j$, implying
that $C = (\ldots, p, u, s, \ldots)$ would be a cycle in $G' - S'$. A contradiction.

Thus $S''$ is a feedback vertex set for $G'$ of size at most $\ell'$ and with $\bigcup_{j \ne i} X_j \subseteq S''$. By the
previous argumentation this implies that $x_i$ is a YES instance.

Since it is easy to verify that the instance $(G', Z', \ell', k')$ can be constructed in polynomial
time from the input instances, this establishes all components required for the
cross-composition and concludes the proof. ◀
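The composition of this proof is small enough to test exhaustively on toy inputs. The sketch below builds $G'$, the deletion set $Z' = Y^*$ and the budget $\ell'$, and decides the output instance by brute force. The instance encoding (a triple of $|X_i|$, $|Y_i|$ and a list of index pairs) and all function names are assumptions of the example; the brute-force search is exponential and only meant for very small graphs.

```python
from itertools import combinations


def is_forest(vertices, edges):
    """Union-find cycle test on the subgraph induced by `vertices`."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for e in edges:
        u, v = tuple(e)
        if u in parent and v in parent:
            ru, rv = find(u), find(v)
            if ru == rv:
                return False
            parent[ru] = rv
    return True


def compose_fvs_cocluster(instances, ell):
    """instances: list of (num_x, num_y, edges), edges given as (x_index, y_index)."""
    t = len(instances)
    num_x, num_y, _ = instances[0]
    edges = set()
    for i, (_, _, inst_edges) in enumerate(instances):
        for p, q in inst_edges:                 # the sets Y_i are identified into Y*
            edges.add(frozenset({("x", i, p), ("y", q)}))
    for i, j in combinations(range(t), 2):      # all edges between X_i and X_j
        for p in range(num_x):
            for r in range(num_x):
                edges.add(frozenset({("x", i, p), ("x", j, r)}))
    y_star = {("y", q) for q in range(num_y)}
    vertices = {("x", i, p) for i in range(t) for p in range(num_x)} | y_star
    return vertices, edges, y_star, (t - 1) * num_x + ell


def has_fvs(vertices, edges, budget):
    """Brute force: is there a feedback vertex set of size at most `budget`?"""
    return any(is_forest(vertices - set(S), edges)
               for k in range(budget + 1)
               for S in combinations(list(vertices), k))
```

As a usage example, composing two copies of a 6-cycle with $\ell = 0$ should give a NO instance, while replacing one copy by a path on the same vertex classes should give a YES instance.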

B.4 Feedback Vertex Set parameterized by Cluster Deletion Distance 

► Definition 23. The $K_4$-in-a-box graph $B_{K_4}$ (see Figure 2) is the graph obtained from
a complete graph on 4 vertices $\{a, b, c, d\}$ by adding a new degree-2 vertex $v$ for each
pair $\{a, b\}, \{b, c\}, \{c, d\}, \{d, a\}$ such that $v$ is adjacent to both vertices of the pair. The
vertices $\{a, c\}$ are the 0-labeled terminals of the graph, and the vertices $\{b, d\}$ are the
1-labeled terminals of the graph.

Figure 2 The $K_4$-in-a-box graph $B_{K_4}$ with labeled vertices.

It is straightforward to verify that any feedback vertex set for $B_{K_4}$ has size at least 2, and
that a size-2 feedback vertex set contains either the 0-labeled terminals or the 1-labeled
terminals.
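The claim about feedback vertex sets of $B_{K_4}$ can be confirmed by enumerating all small vertex subsets. The sketch below is illustrative; the names `s_ab`, `s_bc`, `s_cd`, `s_da` for the four added degree-2 vertices are hypothetical labels introduced for the example.

```python
from itertools import combinations


def bk4_edges():
    """Edges of the K4-in-a-box graph: K4 on a, b, c, d plus four degree-2 vertices."""
    edges = [frozenset(e) for e in combinations("abcd", 2)]
    for u, v in [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]:
        s = "s_" + u + v                       # hypothetical name of the new vertex
        edges += [frozenset((s, u)), frozenset((s, v))]
    return edges


def is_forest(vertices, edges):
    """Union-find cycle test on the subgraph induced by `vertices`."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for e in edges:
        u, v = tuple(e)
        if u in parent and v in parent:
            ru, rv = find(u), find(v)
            if ru == rv:
                return False
            parent[ru] = rv
    return True


def feedback_vertex_sets(edges, size):
    """All feedback vertex sets of exactly the given size, in lexicographic order."""
    vertices = set().union(*edges)
    return [set(S) for S in combinations(sorted(vertices), size)
            if is_forest(vertices - set(S), edges)]
```

Enumerating sizes 0, 1 and 2 shows that no set of fewer than two vertices works and that the only size-2 solutions are the two terminal pairs.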

Proof of Theorem 17. We give a cross-composition from Independent Set to Feedback
Vertex Set parameterized by deletion distance from cluster graphs. Let $x_1, \ldots, x_t$ be $t$
instances of Independent Set each coming with a graph $G_i$ on $n$ vertices and $m$ edges and
asking for an independent set of size at least $\ell$. W.l.o.g. we assume $t$ to be a power of two.
We also consider the vertices of the graphs $G_i$ to be numbered arbitrarily from 1 to $n$. For
the cross-composed instance we construct a graph $G'$ as follows:

1. We add an independent set on $n$ vertices, labeled $v_1, \ldots, v_n$. Let $B$ denote this independent
set. It is intended to encode the selection of a feedback vertex set.

2. To build an instance selector we use $\log t$ copies of the $B_{K_4}$ graph. Each copy corresponds
to one of the $\log t$ bit positions necessary to express numbers from 1 to $t$ (by convention $t$
corresponds to $0 \ldots 0$). The idea is to encode instance selection by forcing either the
two 0-labeled terminals or the two 1-labeled terminals into the feedback vertex set. We
make a total of $n$ copies of this construction.

3. For each instance $x_i$ and any edge $\{p, q\}$ of $G_i$ we make the following construction which
is intended to check edges of the selected instance:

a. Add a clique on $\log t + 2$ new vertices $w_1, \ldots, w_{\log t}, w_{out}, w_{in}$. (The vertex $w_{out}$ will
be adjacent to vertices outside of the clique, namely in the independent set; vertex $w_{in}$
will only have neighbors inside the edge checker.)

b. For each bit position $j \in [\log t]$, connect the vertex $w_j$ to the two vertices labeled 0 in
the $j$-th $B_{K_4}$ graph of each instance selector if the $j$-th bit of $i$ is zero and to those
with label 1 otherwise.

c. Connect $w_{out}$ to $p$ and $q$ in $B$.

We add $n + 2$ disjoint copies of this construction for each of the $m$ edges of each of the $t$
graphs.

We define $\ell' := 2n\log t + (n+2)tm\log t + (n - \ell)$ and we let $Z'$ contain $B$ as well as the $8n\log t$
vertices of the instance selectors. Clearly $G' - Z'$ is a disjoint union of cliques (namely the
edge checkers), and the size of $Z'$ is $k' := n + 8n\log t$. The cross-composition creates the
instance $x' = (G', Z', \ell', k')$, i.e., it asks whether the graph $G'$ has a feedback vertex set of
size at most $\ell'$, and provides a deletion set $Z'$ (of size $k'$) such that $G' - Z'$ is a disjoint
union of cliques. Clearly the parameter value $k'$ is polynomial in $\max_i |x_i| + \log t$, fulfilling
the definition of a cross-composition. It is also easy to see that the construction can be
performed in polynomial time. We will now argue correctness of the cross-composition, i.e.,
we will show that $G'$ has a feedback vertex set of size at most $\ell'$ if and only if at least one of
the graphs $G_i$ has an independent set of size at least $\ell$.

It is helpful to observe that any feedback vertex set of $G'$ contains at least $2\log t$ vertices
from each instance selector (each of the $\log t$ $B_{K_4}$ graphs has two disjoint cycles) and at
least $\log t$ vertices from each edge checker (since each of them is a clique on $\log t + 2$ vertices).
Thus the minimum size of any feedback vertex set for $G'$ is at least $2n\log t + (n+2)tm\log t$.

($\Leftarrow$) We begin by assuming that some instance, say $x_{i^*}$, is YES, i.e., that $G_{i^*}$ has an
independent set of size at least $\ell$. Let $S$ be an independent set of size $\ell$ in $G_{i^*}$; we will
construct a feedback vertex set $S'$ of $G'$ of size at most $\ell'$:

1. In the independent set $B$ we select for $S'$ all vertices except for the $\ell$ vertices from $S$, for
a total of $n - \ell$ vertices.

2. In each $j$-th $B_{K_4}$ graph in any instance selector, select the two 0-labeled terminals if
the $j$-th bit in the binary expansion of $i^*$ is one, and select the two 1-labeled terminals
otherwise. Thus we pick $2\log t$ vertices per selector, i.e., $2n\log t$ vertices in total; clearly
deleting these vertices takes care of any cycles inside the instance selectors.

3. In each edge checker that does not belong to $x_{i^*}$, say it corresponds to some instance $x_i$,
we pick all vertices except for $w_{in}$ and some vertex, say $w_j$, where the binary expansions
of $i^*$ and $i$ differ. Thus $w_j$ will not have neighbors in the instance selectors in $G' - S'$.
We pick a total of $(n+2)(t-1)m\log t$ vertices, skipping $w_{in}$ and some single $w_j$ in each
of these edge checkers.

4. For the edge checkers that correspond to $x_{i^*}$ we select all vertices except $w_{out}$ and $w_{in}$.
These two remaining vertices have no neighbors in the instance selectors. Furthermore,
in $G' - S'$ each vertex $w_{out}$ has at most one neighbor in $B$, since it is adjacent to the endpoints
of some edge $\{p, q\}$ from $G_{i^*}$ but we picked all $n$ vertices except for those from the
independent set $S$, which cannot contain both $p$ and $q$. Thus we pick a total
of $(n+2)m\log t$ vertices.

Thus the set $S'$ is a feedback vertex set of $G'$ of size $2n\log t + (n+2)tm\log t + (n - \ell) = \ell'$,
proving that the cross-composed instance is YES too.
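The counting behind this direction can be replayed numerically. The sketch below computes $\ell'$, $k'$ and the lower bound from the construction, and separately adds up the four selection steps, counting $\log t$ selected vertices in each of the $(n+2)m$ edge checkers of the YES instance. The function names are assumptions of the example.

```python
def budgets(n, m, t, ell):
    """Return (ell', k', lower bound) for t = 2^r instances on n vertices and m edges."""
    r = t.bit_length() - 1                     # r = log t
    assert t == 1 << r, "t is assumed to be a power of two"
    selectors = 2 * n * r                      # 2 per B_K4 graph, r graphs, n copies
    checkers = (n + 2) * t * m * r             # log t per edge checker
    ell_prime = selectors + checkers + (n - ell)
    k_prime = n + 8 * n * r                    # B plus the 8-vertex B_K4 graphs
    return ell_prime, k_prime, selectors + checkers


def size_of_constructed_set(n, m, t, ell):
    """Size of S' as the sum of the four selection steps."""
    r = t.bit_length() - 1
    step1 = n - ell                            # vertices of B outside S
    step2 = 2 * n * r                          # terminals in the instance selectors
    step3 = (n + 2) * (t - 1) * m * r          # checkers of the other instances
    step4 = (n + 2) * m * r                    # checkers of the YES instance
    return step1 + step2 + step3 + step4
```

For any admissible parameters the second function should return exactly the first component of `budgets`, since steps 3 and 4 together contribute $(n+2)tm\log t$.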

($\Rightarrow$) Let us now assume that the cross-composed instance is YES and let $S'$ be a feedback
vertex set of size $\ell' = 2n\log t + (n+2)tm\log t + (n - \ell)$ for $G'$. If $\ell = 0$ then trivially all
instances $x_i$ are YES and there would be nothing to show; so assume $\ell > 0$. Since $S'$ contains
at least $2\log t$ vertices from each instance selector and at least $\log t$ vertices from each edge
checker, it can contain at most $n - \ell$ vertices of the independent set $B$. Let $S$ denote those
vertices of $B$ that were not chosen by $S'$; clearly $|S| \ge \ell$ as $|B| = n$.

We first observe that $S'$ cannot select more than 2 vertices per graph from $s$ $B_{K_4}$
graphs in instance selectors for any $s \ge n$. Otherwise, using the lower bounds for $S'$
on instance selectors and edge checkers, this would imply that the size of $S'$ is greater
than $2n\log t + (n+2)tm\log t + (n - \ell)$; a contradiction to the choice of $S'$:

$2n\log t + (n+2)tm\log t + s > 2n\log t + (n+2)tm\log t + (n - \ell)$,

since $s \ge n > n - \ell$. By the same argumentation $S'$ cannot select more than $\log t$ vertices in $n$
or more edge checkers. Now, considering that $G'$ contains $n$ copies of the instance selector,
there must be at least one copy where $S'$ selects exactly 2 vertices in each of the $\log t$ $B_{K_4}$
graphs. Let $i^* \in \{1, \ldots, t\}$ be the number whose inverted binary expansion matches that
selection of $S'$ (i.e., if $S'$ contains the two 0-vertices then the $j$-th bit of $i^*$ must be one);
again, by convention $t$ matches $0 \ldots 0$. We will show that $S$ constitutes an independent set
for $G_{i^*}$.

We begin by showing that $S'$ does not contain the $w_{out}$-vertex of at least two edge
checkers of any edge $\{p, q\}$ of $G_{i^*}$ (recall that $w_{out}$ is adjacent to $p, q \in B$ in $G'$). The reason
is that each $w_j$-vertex is connected to two terminals of a $B_{K_4}$ graph in which $S'$ selected
the other two terminals, by which $w_j$ is not disconnected. Therefore, $S'$ must contain also
all $w_j$-vertices in edge checkers for the graph $G_{i^*}$, else there would be a cycle through a $B_{K_4}$
graph and the $w_j$-vertex in $G' - S'$. Hence, by the previous argument, there are two edge
checkers for each edge of $G_{i^*}$ where $S'$ does not select $w_{out}$.

Let $p, q \in S$; we show that $G_{i^*}$ does not contain the edge $\{p, q\}$: Assume for contradiction
that $\{p, q\}$ is an edge of $G_{i^*}$ and recall that for at least two of the corresponding edge
checkers $S'$ does not contain $w_{out}$. Thus $S'$ must contain $p$ or $q$, since otherwise there would
be a cycle through $p$, $q$, and the $w_{out}$-vertices of the two checkers. This is a contradiction
since $S$ is defined to contain all vertices of $B$ that are not in $S'$, i.e., $S = B \setminus S'$. Hence $S$ is
indeed an independent set of $G_{i^*}$. ◀

B.5 Weighted Feedback Vertex Set parameterized by Vertex Cover 

An instance of the Weighted Feedback Vertex Set parameterized by vertex cover
problem is a tuple $(G, Z, \ell, w, k)$ where $G$ is a graph, $Z$ is a vertex cover of $G$, $k = |Z|$, $\ell$
is a positive integer and $w : V(G) \to \mathbb{N}^+$ is a weight function that assigns a positive
integral weight to every vertex. The question is whether $G$ has a feedback vertex set $S$ such
that $\sum_{v \in S} w(v) \le \ell$.
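The problem definition can be made concrete with a small checker. The following is a minimal sketch, assuming a simple undirected graph given as a vertex list and an edge list; the function names are illustrative and do not come from the paper.

```python
from typing import Dict, Hashable, Iterable, Tuple

Vertex = Hashable
Edge = Tuple[Vertex, Vertex]


def is_forest(vertices: Iterable[Vertex], edges: Iterable[Edge]) -> bool:
    """Union-find cycle test for a simple undirected graph."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u == root_v:  # u and v are already connected: this edge closes a cycle
            return False
        parent[root_u] = root_v
    return True


def is_vertex_cover(edges: Iterable[Edge], Z: Iterable[Vertex]) -> bool:
    """Check that every edge has at least one endpoint in Z."""
    Z = set(Z)
    return all(u in Z or v in Z for u, v in edges)


def is_weighted_fvs(vertices, edges, weight: Dict[Vertex, int], S, budget: int) -> bool:
    """Check that S has total weight at most budget and that G - S is acyclic."""
    S = set(S)
    remaining = [v for v in vertices if v not in S]
    remaining_edges = [(u, v) for u, v in edges if u not in S and v not in S]
    return sum(weight[v] for v in S) <= budget and is_forest(remaining, remaining_edges)
```

For a triangle with a pendant vertex, removing any single triangle vertex whose weight fits the budget is accepted, while removing only the pendant vertex is rejected because the triangle survives.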

Proof of Theorem [18]. We prove the theorem by showing that Feedback Vertex Set on
Bipartite Graphs cross-composes into Weighted Feedback Vertex Set parameterized
by a vertex cover. By a suitable choice of polynomial equivalence relation we may assume
the input consists of well-formed instances $(G_1, X_1, Y_1, \ell), \ldots, (G_t, X_t, Y_t, \ell)$ which all agree
on the sizes $|X_i|$ and $|Y_i|$ of their vertex sets and which have the same target value $\ell$. By the
argument given in the proof of Theorem [15] we may assume that $t$ is a power of 2.

In each instance $i \in [t]$ we number the vertices of $X_i$ in an arbitrary way from 1 to $|X_i|$,
and we also number the vertices of $Y_i$ from 1 to $|Y_i|$. We construct a graph $G'$ with weight
function $w'$ which has a vertex cover $Z'$ of size $k' := 8r + |X_1|$ (where $r := \log t$), and which has
a feedback vertex set of total weight $\ell' := r(2t|X_1|) + (t - 1)|X_1| + \ell$ if and only if one of the
input graphs has a feedback vertex set of size $\ell$. We will define the weight function $w'$ in an
informal way, by describing the weights that various sets of vertices should receive.

1. Add all independent sets $X_i$ for $i \in [t]$ to the new graph $G'$, and give these vertices
weight 1.

2. Add a vertex set $Y^* = \{y_1, \ldots, y_{|Y_1|}\}$ to the graph and give each vertex weight 1. For each
set $X_i$ with $i \in [t]$ and vertex $v_p \in X_i$ which is numbered $p$, for each neighbor of $v_p$ in $G_i$
numbered $q$ add the edge $\{v_p, y_q\}$ to $G'$. Observe that afterwards the graph $G'[X_i \cup Y^*]$
is isomorphic to $G_i$ for all $i \in [t]$.

3. We can represent an instance number in the range $[t]$ using exactly $\log t$ bits since
we assumed $t$ is a power of 2. For each bit position $j \in [\log t]$ we create a copy of
the graph $Bk_4$ described in Definition 23. We label its 0-terminal vertices $\{b_{j,0'}, b_{j,0''}\}$
and the 1-terminal vertices $\{b_{j,1'}, b_{j,1''}\}$. For each instance number $i$ whose $j$-th most
significant bit in the binary expansion is a 0, we make all vertices of $X_i$ adjacent to
the 0-terminals $\{b_{j,0'}, b_{j,0''}\}$; and for instance numbers whose bit value is 1 we make them
adjacent to $\{b_{j,1'}, b_{j,1''}\}$. We set the weight of each vertex in each copy of $Bk_4$ to $t|X_1|$.

This concludes the description of the graph $G'$ and weight function $w'$. Since a valid instance
of Weighted Feedback Vertex Set parameterized by vertex cover also contains a vertex
cover set $Z'$, we must supply such a vertex cover as part of the output of the procedure.
It is easy to verify that if we let $Z'$ contain the vertex set $Y^*$ and all vertices of each of
the $\log t$ copies of $Bk_4$ then this forms a vertex cover of size $|X_1| + 8 \log t$, hence we can
use this set as part of the output. The parameter value is the size of this vertex cover, and
its value $k' := |Z'| = |X_1| + 8 \log t$ is bounded by a polynomial in $\log t$ plus the size of the
largest instance. It is easy to see that this construction can be carried out in polynomial
time. It remains to prove that $G'$ has a feedback vertex set of weight $\ell'$ if and only if one of
the input graphs has a feedback vertex set of size $\ell$.
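The three construction steps can be sketched in code. This is an illustration under stated assumptions, not the authors' implementation: instances are numbered 0 to t-1 rather than over [t], each instance is given as (|X|, |Y|, set of (p, q) pairs), the vertex names are invented here, and the internal edges of the 8-vertex gadget from Definition 23 are not reproduced but supplied by the caller through the hypothetical parameter `gadget_internal_edges`.

```python
def build_composed_instance(instances, gadget_internal_edges):
    """Return (weight, edges, cover) for the composed graph G'.

    instances: list of (x_size, y_size, edge_pairs), edge_pairs holding (p, q)
        with p numbering a vertex of X_i and q numbering a vertex of Y_i.
    gadget_internal_edges: hypothetical callback j -> iterable of edges inside
        the j-th gadget copy (the structure from Definition 23 is assumed, not given).
    """
    t = len(instances)
    r = t.bit_length() - 1
    assert t == 1 << r, "t must be a power of 2"
    x_size, y_size = instances[0][0], instances[0][1]
    assert all(inst[0] == x_size and inst[1] == y_size for inst in instances)

    weight, edges = {}, set()

    # Step 2 (shared side): the vertex set Y*, each vertex of weight 1.
    y_star = [("y", q) for q in range(1, y_size + 1)]
    for y in y_star:
        weight[y] = 1

    # Step 3 (gadgets): one 8-vertex copy per bit position, each vertex of weight t*|X_1|.
    # The four "in*" names are placeholders for the non-terminal gadget vertices.
    names = ["0'", "0''", "1'", "1''", "in1", "in2", "in3", "in4"]
    for j in range(r):
        for name in names:
            weight[("b", j, name)] = t * x_size
        edges.update(gadget_internal_edges(j))

    for i, (_, _, pairs) in enumerate(instances):
        # Step 1: the independent set X_i, each vertex of weight 1.
        xs = [("x", i, p) for p in range(1, x_size + 1)]
        for v in xs:
            weight[v] = 1
        # Step 2: copy the edges of G_i between X_i and Y*.
        for p, q in pairs:
            edges.add((("x", i, p), ("y", q)))
        # Step 3: connect X_i to the terminals matching the bits of i (MSB first).
        for j in range(r):
            bit = (i >> (r - 1 - j)) & 1
            for v in xs:
                edges.add((v, ("b", j, f"{bit}'")))
                edges.add((v, ("b", j, f"{bit}''")))

    # Z' consists of Y* together with all gadget vertices.
    cover = set(y_star) | {v for v in weight if v[0] == "b"}
    return weight, edges, cover
```

Every edge added by the sketch has an endpoint in Y* or in a gadget copy, which is why the returned set covers all edges regardless of how the gadget's internal edges are filled in.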

($\Rightarrow$) Assume that $G'$ has a feedback vertex set $S'$ of total weight at most $\log t\,(2t|X_1|) +
(t - 1)|X_1| + \ell$. The graph $G'$ contains $\log t$ vertex-disjoint copies of the graph $Bk_4$. By
Definition 23 the feedback vertex set $S'$ must contain at least two vertices from each copy
of $Bk_4$. If there is some copy of $Bk_4$ from which $S'$ contains more than two vertices,
then this set must have weight at least $3t|X_1| + (\log t - 1)2t|X_1|$; but then the set $S''$
which contains all 0-terminal vertices of the copies of $Bk_4$ and the vertices $\bigcup_{i=1}^{t} X_i$ has
weight $\log t\,(2t|X_1|) + t|X_1|$ which is at most as large. Hence by updating the set $S'$ we may
assume that it contains exactly two vertices from each copy of $Bk_4$, and from Definition 23
it then follows that for each copy it contains either the 0-terminal vertices or the 1-terminal
vertices.

We now construct the binary representation of an instance number using the
contents of $S'$. Let the $j$-th bit of the number be a 1 if set $S'$ contains $\{b_{j,1'}, b_{j,1''}\}$, and a 0
in the case that it contains $\{b_{j,0'}, b_{j,0''}\}$; let $i^*$ denote the instance number in the range $[t]$
which is represented by this bitstring. Observe that by the choice of $i^*$, for all vertices
in $X_{i^*}$ all of their neighbors in the $Bk_4$ graphs are contained in $S'$. On the other hand, if
we consider some instance number $i' \neq i^*$ then there is at least one bit position where the
representations of the numbers $i'$ and $i^*$ differ. Let $j$ be such a bit position and assume
for the moment that the $j$-th bit of the number $i^*$ is a 1, which implies the $j$-th bit of $i'$
is a 0 (the other case is symmetric). Then $S'$ contains the terminal vertices $\{b_{j,1'}, b_{j,1''}\}$
but does not contain $\{b_{j,0'}, b_{j,0''}\}$. But then $S'$ must contain all vertices from the set $X_{i'}$,
for if $S'$ would avoid some vertex $v \in X_{i'}$ then the graph $G' - S'$ would contain a cycle
on vertices $\{v, b_{j,0'}, b_{j,0''}\}$ which contradicts the assumption that $S'$ is a feedback vertex
set. This shows that for all instance numbers $i' \neq i^*$ the set $S'$ must contain all vertices
of $X_{i'}$. These vertices together with the 2 terminal vertices in each copy of $Bk_4$ account
for $\log t\,(2t|X_1|) + (t - 1)|X_1|$ of the weight of $S'$, and therefore the remaining vertices in $S'$
have weight at most $\ell$; in particular the set $S'$ contains at most $\ell$ vertices from the set $X_{i^*} \cup Y^*$
since each such vertex has weight 1. We observed earlier that the graph $G'[X_{i^*} \cup Y^*]$ is
isomorphic to $G_{i^*}$. Since $S'$ is a feedback vertex set for $G'$ it must also break all cycles in
all induced subgraphs, hence $G'[X_{i^*} \cup Y^*] - S'$ is acyclic. But since $S'$ contains at most $\ell$
vertices from $X_{i^*} \cup Y^*$ this proves that $S' \cap (X_{i^*} \cup Y^*)$ is a feedback vertex set of size at
most $\ell$ for graph $G_{i^*}$.
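The decoding step of this direction can be illustrated with a short sketch. It assumes 0-based instance numbers and the hypothetical vertex names ("b", j, "0'"), ("b", j, "0''"), ("b", j, "1'"), ("b", j, "1''") for the terminals of the j-th gadget copy; none of these names come from the paper. The last helper only evaluates the two weight expressions used in the exchange argument.

```python
def decode_instance_number(selected, r):
    """Read off the instance number encoded by the terminal pairs in `selected`.

    Bit j (most significant first) is 1 if both 1-terminals of gadget j are
    selected, and 0 if both 0-terminals are selected.
    """
    i_star = 0
    for j in range(r):
        zeros = {("b", j, "0'"), ("b", j, "0''")}
        ones = {("b", j, "1'"), ("b", j, "1''")}
        if ones <= selected and not (zeros & selected):
            bit = 1
        elif zeros <= selected and not (ones & selected):
            bit = 0
        else:
            raise ValueError(f"gadget {j} does not select exactly one terminal pair")
        i_star = (i_star << 1) | bit
    return i_star


def witness_bit(i, i_star, r):
    """Return the first position (MSB first, 0-based) where i and i_star differ, or None."""
    for j in range(r):
        shift = r - 1 - j
        if (i >> shift) & 1 != (i_star >> shift) & 1:
            return j
    return None


def exchange_weights(t, x_size):
    """Return (lower bound with three vertices in one gadget, weight of the set S'')."""
    r = t.bit_length() - 1
    three_in_one = 3 * t * x_size + (r - 1) * 2 * t * x_size
    s_double_prime = r * (2 * t * x_size) + t * x_size
    return three_in_one, s_double_prime
```

For any instance number other than the decoded one, `witness_bit` names a gadget whose unselected terminal pair is adjacent to that instance's X-set, which is the gadget used in the cycle argument.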

($\Leftarrow$) Assume that some input graph $G_{i^*}$ has a feedback vertex set $S$ of size $\ell$. We
show how to construct a feedback vertex set $S'$ for $G'$ of weight at most $\ell' = \log t\,(2t|X_1|) +
(t - 1)|X_1| + \ell$.

1. For each bit position $j \in [\log t]$ add $\{b_{j,0'}, b_{j,0''}\}$ to $S'$ if the $j$-th bit of $i^*$ is a 0, and
otherwise add $\{b_{j,1'}, b_{j,1''}\}$. This contributes a total weight of $r(2t|X_1|)$ with $r = \log t$.

2. Add the set $\bigcup_{i \neq i^*} X_i$ for a total weight of $(t - 1)|X_1|$.

3. Finally add the vertices from $X_{i^*}$ and $Y^*$ which correspond to the vertices in $S$; this adds
a total weight of $|S| = \ell$.

Hence the resulting set $S'$ has weight exactly $\ell'$. To see that $S'$ is indeed a feedback vertex
set for $G'$, observe that by taking two matching terminal vertices for each copy of $Bk_4$ we
have broken all cycles within the $Bk_4$ graphs. For all sets $X_{i'}$ with $i' \neq i^*$ we have taken all
the $X_{i'}$ vertices in $S'$ so $G' - S'$ cannot contain cycles through such sets $X_{i'}$. By taking the
appropriate terminal vertices in $S'$ we have broken all connections between vertices in $X_{i^*}$
and vertices in copies of $Bk_4$. Finally there can be no cycles in $G'[X_{i^*} \cup Y^*] - S'$ since we
assume that $S$ is a feedback vertex set for $G_{i^*}$ which is isomorphic to $G'[X_{i^*} \cup Y^*]$, and we
have made the same choices as $S$ to break all cycles in that induced subgraph. Hence $S'$ is
indeed a feedback vertex set of the desired weight.
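The weight accounting of this direction can be checked mechanically. The sketch below assumes 0-based instance numbers, hypothetical vertex names ("b", j, name), ("x", i, p) and ("y", q), and a solution S of the chosen instance given as pairs ("x", p) or ("y", q); it builds the selected set and sums the weights assigned in the construction, without testing acyclicity.

```python
def build_solution(t, x_size, i_star, S):
    """Assemble S' from a feedback vertex set S of instance i_star."""
    r = t.bit_length() - 1
    assert t == 1 << r and 0 <= i_star < t
    solution = set()
    # Step 1: per gadget, the terminal pair matching the bit of i_star (MSB first).
    for j in range(r):
        bit = (i_star >> (r - 1 - j)) & 1
        solution |= {("b", j, f"{bit}'"), ("b", j, f"{bit}''")}
    # Step 2: every set X_i with i different from i_star.
    for i in range(t):
        if i != i_star:
            solution |= {("x", i, p) for p in range(1, x_size + 1)}
    # Step 3: the vertices of X_{i_star} and Y* that correspond to S.
    for kind, idx in S:
        solution.add(("x", i_star, idx) if kind == "x" else ("y", idx))
    return solution


def solution_weight(solution, t, x_size):
    """Gadget vertices weigh t*|X_1|; all other vertices weigh 1."""
    return sum(t * x_size if v[0] == "b" else 1 for v in solution)
```

With t = 4, |X_1| = 2 and |S| = 2, the three steps contribute 32, 6 and 2, matching the target weight of 40 given by the formula for the budget.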

We have proven that our newly constructed instance $(G', Z', \ell', w', k')$ indeed acts as the
OR of instances $x_1, \ldots, x_t$. Since the output parameter $k' := |Z'|$ is appropriately bounded,
this shows the correctness of the cross-composition. By invoking Corollary^] this is sufficient 
to show that Weighted Feedback Vertex Set parameterized by a vertex cover does not 
admit a polynomial kernel unless NP $\subseteq$ coNP/poly. ◀



